date:20120602


[ 
https://issues.apache.org/jira/browse/HBASE-5974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13287878#comment-13287878
 ] 

Anoop Sam John commented on HBASE-5974:
---

If the solution is fine with everyone, I can make patches for the other versions as well.

@Ted
Regarding the new Exception extending DoNotRetryIOException, I was following 
NSRE.  I can make this change.  I think it should be ok.

 Scanner retry behavior with RPC timeout on next() seems incorrect
 -

 Key: HBASE-5974
 URL: https://issues.apache.org/jira/browse/HBASE-5974
 Project: HBase
  Issue Type: Bug
  Components: client, regionserver
Affects Versions: 0.90.7, 0.92.1, 0.94.0, 0.96.0
Reporter: Todd Lipcon
Assignee: Anoop Sam John
Priority: Critical
 Fix For: 0.94.1

 Attachments: HBASE-5974_0.94.patch, HBASE-5974_94-V2.patch


 I'm seeing the following behavior:
 - set RPC timeout to a short value
 - call next() for some batch of rows, big enough so the client times out 
 before the result is returned
 - the HConnectionManager stuff will retry the next() call to the same server. 
 At this point, one of two things can happen: 1) the previous next() call will 
 still be processing, in which case you get a LeaseException, because it was 
 removed from the map during the processing, or 2) the next() call will 
 succeed but skip the prior batch of rows.
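
The skipped-batch case (2) can be made visible with a per-scanner call sequence number, which is the idea discussed in the comments on this issue. The following is only a minimal sketch under stated assumptions: `CallSeqSketch`, `ServerScanner` and the in-memory row list are hypothetical stand-ins, not HBase code, and the real patch may differ in every detail.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins for illustration; these are not the real HBase classes.
class CallSeqSketch {
  /** Stand-in for the exception type discussed in the patch review. */
  static class CallSequenceOutOfOrderException extends IOException {
    CallSequenceOutOfOrderException(String msg) { super(msg); }
  }

  /** Server-side scanner state: a row cursor plus the call sequence expected next. */
  static class ServerScanner {
    private final List<String> rows;
    private int cursor = 0;
    private long nextCallSeq = 0;

    ServerScanner(List<String> rows) { this.rows = rows; }

    /** Advances the cursor only when the client's sequence number matches. */
    synchronized List<String> next(int batch, long callSeq) throws IOException {
      if (callSeq != nextCallSeq) {
        throw new CallSequenceOutOfOrderException(
            "expected callSeq " + nextCallSeq + " but got " + callSeq);
      }
      nextCallSeq++;
      int end = Math.min(cursor + batch, rows.size());
      List<String> out = new ArrayList<>(rows.subList(cursor, end));
      cursor = end;
      return out;
    }
  }

  /** Simulates a next() whose reply is lost to a timeout, followed by a blind retry. */
  static boolean retryIsRejected() {
    ServerScanner scanner = new ServerScanner(Arrays.asList("r1", "r2", "r3", "r4"));
    try {
      scanner.next(2, 0); // server returns r1, r2 but the client has already timed out
      scanner.next(2, 0); // the retry reuses callSeq 0, so the server refuses it
      return false;       // reached only if the retry silently skipped r1, r2
    } catch (IOException e) {
      return e instanceof CallSequenceOutOfOrderException;
    }
  }
}
```

Without the sequence check, the second call would return r3 and r4 and the client would never see r1 and r2.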

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5936) Add Column-level PB-based calls to HMasterInterface

2012-06-02 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13287896#comment-13287896
 ] 

Hudson commented on HBASE-5936:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #37 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/37/])
HBASE-5936 Addendum adds changes for TestHMasterRPCException that were 
missed in previous checkin (Revision 1345441)

 Result = FAILURE
tedyu : 
Files : 
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestHMasterRPCException.java


 Add Column-level PB-based calls to HMasterInterface
 ---

 Key: HBASE-5936
 URL: https://issues.apache.org/jira/browse/HBASE-5936
 Project: HBase
  Issue Type: Task
  Components: ipc, master, migration
Reporter: Gregory Chanan
Assignee: Gregory Chanan
 Fix For: 0.96.0

 Attachments: 5936-addendum-v2.txt, HBASE-5936-v3.patch, 
 HBASE-5936-v4.patch, HBASE-5936-v4.patch, HBASE-5936-v5.patch, 
 HBASE-5936-v6.patch, HBASE-5936.patch


 This should be a subtask of HBASE-5445, but since that is a subtask, I can't 
 also make this a subtask (apparently).
 This is for converting the column-level calls, i.e.:
 addColumn
 deleteColumn
 modifyColumn

[jira] [Commented] (HBASE-6138) HadoopQA not running findbugs [Trunk]

2012-06-02 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-6138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13287897#comment-13287897
 ] 

Hudson commented on HBASE-6138:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #37 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/37/])
HBASE-6138 HadoopQA not running findbugs [Trunk] (Anoop Sam John) (Revision 
1345391)

 Result = FAILURE
tedyu : 
Files : 
* /hbase/trunk/pom.xml


 HadoopQA not running findbugs [Trunk]
 -

 Key: HBASE-6138
 URL: https://issues.apache.org/jira/browse/HBASE-6138
 Project: HBase
  Issue Type: Bug
  Components: build
Affects Versions: 0.96.0
Reporter: Anoop Sam John
Assignee: Anoop Sam John
 Fix For: 0.96.0

 Attachments: 6138.txt


 HadoopQA reports:
  -1 findbugs.  The patch appears to cause Findbugs (version 1.3.9) to fail.
 but no link to a findbugs report is shown.
 When I checked the console output for the build, I saw:
 {code}
 [INFO] --- findbugs-maven-plugin:2.4.0:findbugs (default-cli) @ hbase-common 
 ---
 [INFO] Fork Value is true
 [INFO] 
 
 [INFO] Reactor Summary:
 [INFO] 
 [INFO] HBase . SUCCESS [1.890s]
 [INFO] HBase - Common  FAILURE [2.238s]
 [INFO] HBase - Server  SKIPPED
 [INFO] HBase - Assembly .. SKIPPED
 [INFO] HBase - Site .. SKIPPED
 [INFO] 
 
 [INFO] BUILD FAILURE
 [INFO] 
 
 [INFO] Total time: 4.856s
 [INFO] Finished at: Thu May 31 03:35:35 UTC 2012
 [INFO] Final Memory: 23M/154M
 [INFO] 
 
 [ERROR] Could not find resource 
 '${parent.basedir}/dev-support/findbugs-exclude.xml'. - [Help 1]
 [ERROR] 
 {code}
 Because of this error, Findbugs is not getting run!

[jira] [Commented] (HBASE-5923) Cleanup checkAndXXX logic

2012-06-02 Thread Lars Hofhansl (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13287911#comment-13287911
]

Lars Hofhansl commented on HBASE-5923:
--

That works. The other problem is o.a.h.h.Filter.WritableByteArrayComparable.
I thought I could move this to o.a.h.h.BaseWritableByteArrayComparable and have
o.a.h.h.Filter.WritableByteArrayComparable be a no-op subclass, but that would
change the wire protocol :(

Initially I thought one could just always use BinaryComparator, but especially for
LESS/GREATER type operations it is important to be able to control the sort
order (for example for Unicode).

It seems I'm stumped. Either o.a.h.h.Filter.WritableByteArrayComparable has to
leak up into HTableInterface, or the wire protocol changes.
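
To illustrate why the choice of comparator matters for LESS/GREATER checks, here is a minimal sketch. It is not HBase code: `ComparatorSketch` and both comparators are hypothetical, and case-insensitive ordering is used as a simpler stand-in for a Unicode-aware sort order.

```java
import java.nio.charset.StandardCharsets;
import java.util.Comparator;

// Hypothetical illustration only; not the HBase comparator classes.
class ComparatorSketch {
  /** Unsigned lexicographic byte comparison, the kind of ordering a binary comparator gives. */
  static final Comparator<byte[]> BINARY = (a, b) -> {
    int n = Math.min(a.length, b.length);
    for (int i = 0; i < n; i++) {
      int diff = (a[i] & 0xff) - (b[i] & 0xff);
      if (diff != 0) return diff;
    }
    return a.length - b.length;
  };

  /** A different ordering over the same bytes: decode as UTF-8, compare ignoring case. */
  static final Comparator<byte[]> CASE_INSENSITIVE = (a, b) ->
      String.CASE_INSENSITIVE_ORDER.compare(
          new String(a, StandardCharsets.UTF_8), new String(b, StandardCharsets.UTF_8));

  /** Evaluates "stored LESS than expected" under the given comparator. */
  static boolean less(Comparator<byte[]> cmp, String stored, String expected) {
    return cmp.compare(stored.getBytes(StandardCharsets.UTF_8),
                       expected.getBytes(StandardCharsets.UTF_8)) < 0;
  }
}
```

With these definitions, "B" is LESS than "a" under the byte-wise order but not under the case-insensitive one, so a check that hard-codes one comparator cannot express the other.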

Cleanup checkAndXXX logic
-

Key: HBASE-5923
URL: https://issues.apache.org/jira/browse/HBASE-5923
Project: HBase
Issue Type: Improvement
Components: client, regionserver
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Priority: Minor
Fix For: 0.96.0, 0.94.1

Attachments: 5923-0.94.txt, 5923-trunk.txt

1. the checkAnd{Put|Delete} method that takes a CompareOP is not exposed via
HTable[Interface].
2. there is unnecessary duplicate code in the checkAnd{Put|Delete} code in
HRegionServer.

[jira] [Commented] (HBASE-5974) Scanner retry behavior with RPC timeout on next() seems incorrect

2012-06-02 Thread Lars Hofhansl (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13287916#comment-13287916
 ] 

Lars Hofhansl commented on HBASE-5974:
--

+1 for V2

[jira] [Commented] (HBASE-6059) Replaying recovered edits would make deleted data exist again

2012-06-02 Thread Lars Hofhansl (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13287917#comment-13287917
 ] 

Lars Hofhansl commented on HBASE-6059:
--

I don't grok the patch in every detail, but it looks good and is the same as the trunk patch. 
So +1.
@Stack: Maybe you can take a look as well, to be safe?

 Replaying recovered edits would make deleted data exist again
 -

 Key: HBASE-6059
 URL: https://issues.apache.org/jira/browse/HBASE-6059
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Reporter: chunhui shen
Assignee: chunhui shen
 Fix For: 0.96.0

 Attachments: 6059v6.txt, 6059v7-94.patch, 6059v7.txt, 6059v7.txt, 
 HBASE-6059-testcase.patch, HBASE-6059.patch, HBASE-6059v2.patch, 
 HBASE-6059v3.patch, HBASE-6059v4.patch, HBASE-6059v5.patch


 When we replay recovered edits, we use the minSeqId of the Stores, which may cause 
 deleted data to appear again.
 Let's see how it happens. Suppose the region has two families (cf1, cf2):
 1. Put one cell into the region (put r1,cf1:q1,v1).
 2. Move the region from server A to server B.
 3. Delete the data put in step 1 (delete r1).
 4. Flush this region.
 5. Run a major compaction on this region.
 6. Move the region from server B to server A.
 7. Abort server A.
 8. After the region is online again, we can read the deleted data (r1,cf1:q1,v1).
 (When we replay recovered edits, we use the minSeqId of the Stores; because cf2 
 has no store files, its seqId is 0, so the logged edit for the put is 
 replayed into the region.)
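
The difference between a region-wide threshold and a per-family threshold can be sketched as below. This is an assumption-laden illustration, not the attached patch: `ReplaySketch` and both method names are hypothetical, and the sequence ids are made up to match the scenario (cf1 flushed through seqId 2, cf2 with no store files).

```java
import java.util.Map;

// Hypothetical model of the replay decision; not the HRegion implementation.
class ReplaySketch {
  /** Region-wide threshold: replay anything newer than the smallest store seqId. */
  static boolean replayWithRegionMin(Map<String, Long> maxSeqIdPerStore, long editSeqId) {
    long min = Long.MAX_VALUE;
    for (long id : maxSeqIdPerStore.values()) {
      min = Math.min(min, id);
    }
    return editSeqId > min;
  }

  /** Per-store threshold: replay only edits newer than what the edit's own family has flushed. */
  static boolean replayWithStoreSeqId(
      Map<String, Long> maxSeqIdPerStore, String family, long editSeqId) {
    return editSeqId > maxSeqIdPerStore.getOrDefault(family, 0L);
  }
}
```

With cf1 at 2 and cf2 at 0, the put with seqId 1 is replayed under the region-wide rule (1 > 0) but skipped under the per-store rule (1 is not greater than 2), which is one possible way to keep the deleted cell from coming back.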

[jira] [Commented] (HBASE-5974) Scanner retry behavior with RPC timeout on next() seems incorrect


[ 
https://issues.apache.org/jira/browse/HBASE-5974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13287928#comment-13287928
 ] 

ramkrishna.s.vasudevan commented on HBASE-5974:
---

+1 from me too.  

[jira] [Commented] (HBASE-6046) Master retry on ZK session expiry causes inconsistent region assignments.


[ 
https://issues.apache.org/jira/browse/HBASE-6046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13287932#comment-13287932
 ] 

ramkrishna.s.vasudevan commented on HBASE-6046:
---

If the patch is OK, I can prepare one for trunk as well.  I can fix the comment on 
commit.  
@Stack
Please review and provide your comments on this.

 Master retry on ZK session expiry causes inconsistent region assignments.
 -

 Key: HBASE-6046
 URL: https://issues.apache.org/jira/browse/HBASE-6046
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.92.1, 0.94.0
Reporter: Gopinathan A
Assignee: ramkrishna.s.vasudevan
 Attachments: HBASE_6046_0.94.patch, HBASE_6046_0.94_1.patch, 
 HBASE_6046_0.94_2.patch


 1. A ZK session timeout in the HMaster leads to bulk assignment even though all the 
 RSs are online.
 2. While doing bulk assignment, if the master goes down again and restarts (or a 
 backup comes up), all the nodes created in ZK will be reassigned 
 to new RSs. This leads to double assignment.
 We had 2800 regions; 1900 of them got double assigned, taking the 
 region count to 4700. 

[jira] [Commented] (HBASE-6060) Regions's in OPENING state from failed regionservers takes a long time to recover

[
https://issues.apache.org/jira/browse/HBASE-6060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13287934#comment-13287934
]

ramkrishna.s.vasudevan commented on HBASE-6060:
---

+1 on v4. Hope test suite passes.

Regions's in OPENING state from failed regionservers takes a long time to
recover
-

Key: HBASE-6060
URL: https://issues.apache.org/jira/browse/HBASE-6060
Project: HBase
Issue Type: Bug
Components: master, regionserver
Reporter: Enis Soztutar
Assignee: Enis Soztutar
Attachments: 6060-94-v3.patch, 6060-94-v4.patch, HBASE-6060-94.patch

We have seen a pattern in tests where regions are stuck in OPENING state
for a very long time when the region server that is opening the region fails.
My understanding of the process:

- master calls rs to open the region. If rs is offline, a new plan is
generated (a new rs is chosen). RegionState is set to PENDING_OPEN (only in
master memory, zk still shows OFFLINE). See HRegionServer.openRegion(),
HMaster.assign()
- RegionServer starts opening a region and changes the state in the znode. But
that znode is not ephemeral. (see ZkAssign)
- Rs transitions zk node from OFFLINE to OPENING. See
OpenRegionHandler.process()
- rs then opens the region, and changes znode from OPENING to OPENED
- when rs is killed between OPENING and OPENED states, then zk shows OPENING
state, and the master just waits for rs to change the region state, but since
rs is down, that won't happen.
- There is an AssignmentManager.TimeoutMonitor, which guards
against exactly these kinds of conditions. It periodically checks (every 10 sec by
default) the regions in transition to see whether they have timed out
(hbase.master.assignment.timeoutmonitor.timeout). Default timeout is 30 min,
which explains what you and I are seeing.
- ServerShutdownHandler in Master does not reassign regions in OPENING
state, although it handles other states.
Lowering that threshold in the configuration is one option, but I still
think we can do better.
Will investigate more.
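
The two recovery paths described above can be contrasted in a small sketch. Everything here is hypothetical (`OpeningRecoverySketch`, `RegionInTransition`, the method names); it models the idea only and is not the AssignmentManager or ServerShutdownHandler code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model of regions in transition; not the real AssignmentManager types.
class OpeningRecoverySketch {
  enum State { OFFLINE, PENDING_OPEN, OPENING, OPENED }

  static class RegionInTransition {
    final State state;
    final String server;  // region server that last touched the znode
    final long stampMs;   // time of the last state change
    RegionInTransition(State state, String server, long stampMs) {
      this.state = state;
      this.server = server;
      this.stampMs = stampMs;
    }
  }

  /** Timeout-style check: a region is picked up only after timeoutMs without progress. */
  static List<String> timedOut(Map<String, RegionInTransition> rit, long nowMs, long timeoutMs) {
    List<String> out = new ArrayList<>();
    for (Map.Entry<String, RegionInTransition> e : rit.entrySet()) {
      if (nowMs - e.getValue().stampMs > timeoutMs) out.add(e.getKey());
    }
    return out;
  }

  /** Shutdown-style check: OPENING regions owned by a dead server are picked up at once. */
  static List<String> openingOnDeadServers(Map<String, RegionInTransition> rit, Set<String> dead) {
    List<String> out = new ArrayList<>();
    for (Map.Entry<String, RegionInTransition> e : rit.entrySet()) {
      RegionInTransition r = e.getValue();
      if (r.state == State.OPENING && dead.contains(r.server)) out.add(e.getKey());
    }
    return out;
  }
}
```

One minute after a region server dies mid-open, the timeout check with a 30 minute threshold still finds nothing, while the dead-server check already lists the region for reassignment.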

[jira] [Commented] (HBASE-6060) Regions's in OPENING state from failed regionservers takes a long time to recover

[
https://issues.apache.org/jira/browse/HBASE-6060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13287937#comment-13287937
]

Zhihong Yu commented on HBASE-6060:
---

Test suite passed.

[jira] [Commented] (HBASE-5974) Scanner retry behavior with RPC timeout on next() seems incorrect


[ 
https://issues.apache.org/jira/browse/HBASE-5974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13287949#comment-13287949
 ] 

Zhihong Yu commented on HBASE-5974:
---

HRegionInterface.java doesn't exist in trunk so patch v2 wouldn't apply to 
trunk.
I would suggest creating patch for trunk and run through hadoop QA.
{code}
+  LOG.info("Seq number based scan API not present at RS side! Trying with API: 
{code}
I think the above log should be at warn level.
{code}
+} else if (ioe instanceof CallSequenceOutOfOrderException) {
+  // The callSeq from the client not matched with the one expected at the RS side
+  // This means the RS might have done extra scanning of data which is not received by the
+  // client.Throw a DNRE so that we close the current scanner and opens a new one with RS.
+  throw new DoNotRetryIOException("Reset scanner", ioe);
{code}
Should we disclose a little more detail in the message of the DNRIOE? The above is 
the same as the response to NotServingRegionException and 
RegionServerStoppedException.
'not matched with' -> 'does not match'
'is not received' -> 'has not been received'
'opens a new' -> 'open a new'
{code}
+// if callSeq do not match throw Exception straight away. This needs to be performed even
{code}
'do not match' -> 'does not match'
{code}
+public class TestClientScannerRPCTimesout {^M
{code}
Please add a short javadoc for the test class. I think it should be called 
TestClientScannerRPCTimeout.
Please use a utility such as dos2unix to remove the trailing ^M from the patch 
file.
{code}
+  public static class RegionServerWithScanTimesout extends MiniHBaseClusterRegionServer {^M
{code}
The above class can be made private. It should be named 
RegionServerWithScanTimeout.
{code}
+ * Thrown by a region server while scan related next() calls. Both client and server maintain a^M
+ * callSequence and if the both do not match, RS will throw this exception.^M
+ */^M
+public class CallSequenceOutOfOrderException extends IOException {^M
{code}
CallSequenceOutOfOrderException should extend DoNotRetryIOException so that we 
don't need to create a DoNotRetryIOException instance (shown above).
'while scan related next()' -> 'while doing scan related next()'
'the both do not' -> 'they do not'
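
The effect of the suggested class hierarchy on a retry loop can be sketched as follows. The classes are hypothetical stand-ins that reuse only the names from this review; `RetrySketch.attempts` is an invented helper, not the HConnectionManager retry code.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

// Hypothetical stand-ins that mirror only the class names used in the review.
class RetrySketch {
  static class DoNotRetryIOException extends IOException {
    DoNotRetryIOException(String msg) { super(msg); }
  }

  /** Subclassing the do-not-retry type means the caller needs no wrapping step. */
  static class CallSequenceOutOfOrderException extends DoNotRetryIOException {
    CallSequenceOutOfOrderException(String msg) { super(msg); }
  }

  /** Runs the call up to maxAttempts times and returns how many attempts were made. */
  static int attempts(Callable<Void> call, int maxAttempts) {
    int made = 0;
    while (made < maxAttempts) {
      made++;
      try {
        call.call();
        return made;            // success
      } catch (DoNotRetryIOException e) {
        return made;            // give up at once; the caller must reopen the scanner
      } catch (Exception e) {
        // any other failure is treated as retriable in this sketch
      }
    }
    return made;
  }
}
```

Because the out-of-order exception is itself a do-not-retry exception, the loop stops after one attempt without any `instanceof` check or re-wrapping, while a plain IOException still uses all attempts.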

It would be nice for Todd to take a look at the patch.

[jira] [Updated] (HBASE-6152) Split abort is not handled properly

[
https://issues.apache.org/jira/browse/HBASE-6152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

ramkrishna.s.vasudevan updated HBASE-6152:
--

Attachment: HBASE-6152_0.94.patch

Test case to reproduce this issue. In fact, this will happen in 0.92.0 and
0.92.1, but not in the latest code in 0.92 or 0.94.
The current code no longer has this line:
{code}
if (rs.isSplit() || rs.isSplitting()) {
{code}
so it should not create a problem there. It was
removed as part of HBASE-6070. Prior to that it could have happened. The same
has been reproduced in the test case.

Split abort is not handled properly
---

Key: HBASE-6152
URL: https://issues.apache.org/jira/browse/HBASE-6152
Project: HBase
Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Devaraj Das
Assignee: Devaraj Das
Attachments: HBASE-6152_0.94.patch

I ran into this:
1. RegionServer started to split a region(R), but the split was taking a long
time, and hence the split was aborted
2. As part of cleanup, the RS deleted the ZK node that it created initially
for R
3. The master (AssignmentManager) noticed the node deletion, and made R
offline
4. The RS recovered from the failure, and at some point of time, tried to do
the split again.
5. The master got an event RS_ZK_REGION_SPLIT but the server gave an error
like - Received SPLIT for region R from server RS but it doesn't exist
anymore,..
6. The RS apparently did the split successfully this time, but is stuck on
the master to delete the znode for the region. It kept on saying -
org.apache.hadoop.hbase.regionserver.SplitTransaction: Still waiting on the
master to process the split for R and it was stuck there forever.

[jira] [Commented] (HBASE-5974) Scanner retry behavior with RPC timeout on next() seems incorrect


[ 
https://issues.apache.org/jira/browse/HBASE-5974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13287953#comment-13287953
 ] 

Zhihong Yu commented on HBASE-5974:
---

w.r.t. keeping RegionScannerHolder, I posted a poll to dev@hbase for use case 
of letting [pre,post]ScannerOpen() return a custom RegionScanner implementation.

[jira] [Commented] (HBASE-6151) Master can die if RegionServer throws ServerNotRunningYet


[ 
https://issues.apache.org/jira/browse/HBASE-6151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13287971#comment-13287971
 ] 

Zhihong Yu commented on HBASE-6151:
---

ServerNotRunningYetException should be handled in the last catch block of 
getCachedConnection():
{code}
} catch (IOException ioe) {
{code}

 Master can die if RegionServer throws ServerNotRunningYet
 -

 Key: HBASE-6151
 URL: https://issues.apache.org/jira/browse/HBASE-6151
 Project: HBase
  Issue Type: Bug
  Components: ipc
Affects Versions: 0.90.7, 0.92.2, 0.96.0, 0.94.1
Reporter: Gregory Chanan
Assignee: Gregory Chanan

 See, for example:
 {noformat}
 2012-05-23 16:49:22,745 FATAL org.apache.hadoop.hbase.master.HMaster: 
 Unhandled exception. Starting shutdown.
 org.apache.hadoop.hbase.ipc.ServerNotRunningException: 
 org.apache.hadoop.hbase.ipc.ServerNotRunningException: Server is not running 
 yet
   at 
 org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1038)
   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
   at 
 sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
   at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
   at 
 org.apache.hadoop.hbase.RemoteExceptionHandler.decodeRemoteException(RemoteExceptionHandler.java:96)
   at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1240)
   at 
 org.apache.hadoop.hbase.catalog.CatalogTracker.getCachedConnection(CatalogTracker.java:444)
   at 
 org.apache.hadoop.hbase.catalog.CatalogTracker.getMetaServerConnection(CatalogTracker.java:343)
   at 
 org.apache.hadoop.hbase.catalog.CatalogTracker.verifyMetaRegionLocation(CatalogTracker.java:540)
   at 
 org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:474)
   at 
 org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:412)
 {noformat}
 The HRegionServer calls HBaseServer:
 {code}
   public void start() {
     startThreads();
     openServer();
   }
 {code}
 but the server can start accepting RPCs as soon as the threads have been started; 
 any RPC that arrives then gets a ServerNotRunningException until openServer() runs.  
 We should probably
 1) Catch the remote exception and retry on the master
 2) Look into whether the start() behavior of HBaseServer makes any sense.  
 Why would you start accepting RPCs only to throw back 
 ServerNotRunningException?
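
Option 1 could look roughly like the sketch below. It is an illustration under assumptions: `StartupRetrySketch`, `FakeServer` and `callWithRetry` are hypothetical, and the nested exception only borrows the name seen in the stack trace above.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-ins; the exception only reuses the name from the stack trace.
class StartupRetrySketch {
  static class ServerNotRunningException extends IOException {
    ServerNotRunningException(String msg) { super(msg); }
  }

  /** Accepts calls immediately but rejects the first few, like a server before openServer(). */
  static class FakeServer {
    private int rejectsLeft;
    FakeServer(int rejectsLeft) { this.rejectsLeft = rejectsLeft; }

    String ping() throws IOException {
      if (rejectsLeft > 0) {
        rejectsLeft--;
        throw new ServerNotRunningException("Server is not running yet");
      }
      return "ok";
    }
  }

  /** Retries only on ServerNotRunningException, pausing between attempts. */
  static String callWithRetry(FakeServer server, int maxAttempts, long pauseMs) {
    if (maxAttempts < 1) throw new IllegalArgumentException("maxAttempts must be >= 1");
    IOException last = null;
    for (int i = 0; i < maxAttempts; i++) {
      try {
        return server.ping();
      } catch (ServerNotRunningException e) {
        last = e;               // server is up but not open yet: wait and retry
        try {
          Thread.sleep(pauseMs);
        } catch (InterruptedException ie) {
          Thread.currentThread().interrupt();
          break;
        }
      } catch (IOException e) {
        throw new UncheckedIOException(e); // other failures are not retried here
      }
    }
    throw new UncheckedIOException(last);
  }
}
```

The caller survives a server that is still starting as long as it opens within the attempt budget; only when the budget is exhausted does the failure propagate.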

[jira] [Updated] (HBASE-6067) HBase won't start when hbase.rootdir uses ViewFileSystem


 [ 
https://issues.apache.org/jira/browse/HBASE-6067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-6067:
--

Attachment: 6067.txt

Patch v1 introduces reflection to detect the presence of 
getDefaultBlockSize(Path f)

TestHLog passes.
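
The reflection-based detection can be sketched roughly as below. This is not the contents of 6067.txt: `BlockSizeSketch`, `OldFs` and `NewFs` are hypothetical stand-ins for an older and a newer FileSystem API, and a `String` path is used in place of Hadoop's `Path` to keep the example self-contained.

```java
import java.lang.reflect.Method;

// Hypothetical stand-ins for an older and a newer filesystem API; not Hadoop classes.
class BlockSizeSketch {
  /** Older API: only the no-argument variant exists. */
  static class OldFs {
    public long getDefaultBlockSize() { return 64L; }
  }

  /** Newer API: adds a variant that takes a path. */
  static class NewFs extends OldFs {
    public long getDefaultBlockSize(String path) { return 128L; }
  }

  /** Prefers the path-taking variant when present, else falls back to the no-arg one. */
  static long defaultBlockSize(Object fs, String path) {
    try {
      Method m;
      Object[] args;
      try {
        m = fs.getClass().getMethod("getDefaultBlockSize", String.class);
        args = new Object[] { path };
      } catch (NoSuchMethodException e) {
        m = fs.getClass().getMethod("getDefaultBlockSize");
        args = new Object[0];
      }
      return (Long) m.invoke(fs, args);
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException("could not determine default block size", e);
    }
  }
}
```

The same code then runs against both API levels: the lookup succeeds on the newer class and quietly falls back on the older one.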

 HBase won't start when hbase.rootdir uses ViewFileSystem
 

 Key: HBASE-6067
 URL: https://issues.apache.org/jira/browse/HBASE-6067
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: Eli Collins
Assignee: Eli Collins
 Attachments: 6067.txt


 HBase currently doesn't work with HDFS federation (hbase.rootdir with a 
 client that uses viewfs) because HLog#init uses 
 FileSystem#getDefaultBlockSize and getDefaultReplication. These throw an 
 exception because there is no default filesystem in a viewfs client so 
 there's no way to determine a default block size or replication factor. They 
 could use the versions of these methods that take a path, however these were 
 introduced in HADOOP-8014 and are not yet available in Hadoop 1.x.

[jira] [Updated] (HBASE-6067) HBase won't start when hbase.rootdir uses ViewFileSystem


 [ 
https://issues.apache.org/jira/browse/HBASE-6067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-6067:
--

Status: Patch Available  (was: Open)

[jira] [Commented] (HBASE-5974) Scanner retry behavior with RPC timeout on next() seems incorrect


[ 
https://issues.apache.org/jira/browse/HBASE-5974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13287974#comment-13287974
 ] 

Anoop Sam John commented on HBASE-5974:
---

@Ted
I will look into your comments; the 94 patch needs a rebase too.
I will make a separate patch for trunk as well.

wrt pre and post CP hooks, it is not only about creating a new custom RegionScanner, 
but maybe creating a wrapper for the actual RegionScanner.  In one of our 
impls, we use this approach. [Just a wrapper which delegates the calls with some 
extra steps for next() calls. ]  Here also, if we add new methods to the 
RegionScanner interface to deal with the check and increment of this 
seqNo, they will get exposed to the user. I felt this might look odd to them. 

[jira] [Commented] (HBASE-6067) HBase won't start when hbase.rootdir uses ViewFileSystem

2012-06-02 Thread Hadoop QA (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-6067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13287984#comment-13287984
]

Hadoop QA commented on HBASE-6067:
--

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12530653/6067.txt
against trunk revision .

+1 @author. The patch does not contain any @author tags.

-1 tests included. The patch doesn't appear to include any new or modified
tests.
Please justify why no new tests are needed for this
patch.
Also please list what manual steps were performed to
verify this patch.

+1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

+1 javadoc. The javadoc tool did not generate any warning messages.

+1 javac. The applied patch does not increase the total number of javac
compiler warnings.

-1 findbugs. The patch appears to cause Findbugs (version 1.3.9) to fail.

+1 release audit. The applied patch does not increase the total number of
release audit warnings.

+1 core tests. The patch passed unit tests in .

Test results:
https://builds.apache.org/job/PreCommit-HBASE-Build/2088//testReport/
Console output:
https://builds.apache.org/job/PreCommit-HBASE-Build/2088//console

This message is automatically generated.

HBase won't start when hbase.rootdir uses ViewFileSystem

Key: HBASE-6067
URL: https://issues.apache.org/jira/browse/HBASE-6067
Project: HBase
Issue Type: Improvement
Components: regionserver
Reporter: Eli Collins
Assignee: Eli Collins
Attachments: 6067.txt

HBase currently doesn't work with HDFS federation (hbase.rootdir with a
client that uses viewfs) because HLog#init uses
FileSystem#getDefaultBlockSize and getDefaultReplication. These throw an
exception because there is no default filesystem in a viewfs client so
there's no way to determine a default block size or replication factor. They
could use the versions of these methods that take a path, however these were
introduced in HADOOP-8014 and are not yet available in Hadoop 1.x.
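
A rough sketch of the reflection approach hinted at here: look up the path-taking variant at runtime and fall back to the no-argument one when it is missing. The two filesystem classes below are hypothetical stand-ins, not Hadoop classes, and a String is used in place of Hadoop's Path so the example compiles on its own; BlockSizeLookup and its method name are likewise made up for illustration.

```java
import java.lang.reflect.Method;

/** Stand-in for a newer filesystem: only the path-taking variant works (hypothetical). */
class NewStyleFs {
    public long getDefaultBlockSize() {
        throw new UnsupportedOperationException("no default filesystem");
    }

    public long getDefaultBlockSize(String path) {
        return 128L * 1024 * 1024;
    }
}

/** Stand-in for an older filesystem that has no path-taking variant (hypothetical). */
class OldStyleFs {
    public long getDefaultBlockSize() {
        return 64L * 1024 * 1024;
    }
}

final class BlockSizeLookup {
    private BlockSizeLookup() {}

    /** Prefers getDefaultBlockSize(path) when the class has it, else the no-arg method. */
    static long defaultBlockSize(Object fs, String path) {
        try {
            Method m;
            Object[] args;
            try {
                m = fs.getClass().getMethod("getDefaultBlockSize", String.class);
                args = new Object[] { path };
            } catch (NoSuchMethodException e) {
                // Older API: only the no-argument variant exists.
                m = fs.getClass().getMethod("getDefaultBlockSize");
                args = new Object[0];
            }
            // Needed when the declaring class is not public to the caller.
            m.setAccessible(true);
            return (Long) m.invoke(fs, args);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot determine default block size", e);
        }
    }
}
```

Calling BlockSizeLookup.defaultBlockSize(fs, "/hbase/.logs") would then work against either stand-in. If the lookup ran on a hot path, the Method could be resolved once and kept in a field rather than looked up on every call.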

[jira] [Updated] (HBASE-5974) Scanner retry behavior with RPC timeout on next() seems incorrect


 [ 
https://issues.apache.org/jira/browse/HBASE-5974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anoop Sam John updated HBASE-5974:
--

Attachment: HBASE-5974_94-V3.patch

 Scanner retry behavior with RPC timeout on next() seems incorrect
 -

 Key: HBASE-5974
 URL: https://issues.apache.org/jira/browse/HBASE-5974
 Project: HBase
  Issue Type: Bug
  Components: client, regionserver
Affects Versions: 0.90.7, 0.92.1, 0.94.0, 0.96.0
Reporter: Todd Lipcon
Assignee: Anoop Sam John
Priority: Critical
 Fix For: 0.94.1

 Attachments: HBASE-5974_0.94.patch, HBASE-5974_94-V2.patch, 
 HBASE-5974_94-V3.patch


 I'm seeing the following behavior:
 - set RPC timeout to a short value
 - call next() for some batch of rows, big enough so the client times out 
 before the result is returned
 - the HConnectionManager stuff will retry the next() call to the same server. 
 At this point, one of two things can happen: 1) the previous next() call will 
 still be processing, in which case you get a LeaseException, because it was 
 removed from the map during the processing, or 2) the next() call will 
 succeed but skip the prior batch of rows.


[jira] [Commented] (HBASE-5974) Scanner retry behavior with RPC timeout on next() seems incorrect


[ 
https://issues.apache.org/jira/browse/HBASE-5974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13287993#comment-13287993
 ] 

Anoop Sam John commented on HBASE-5974:
---

Patch addressing Ted's comments
{quote}
The above class can be made private. It should be named 
RegionServerWithScanTimeout.
{quote}
The class name is changed, but we cannot make the class private: if we did, 
the RS impl class could not be instantiated.


[jira] [Commented] (HBASE-5974) Scanner retry behavior with RPC timeout on next() seems incorrect


[ 
https://issues.apache.org/jira/browse/HBASE-5974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13287994#comment-13287994
 ] 

Anoop Sam John commented on HBASE-5974:
---

I will give a patch for trunk tomorrow.


[jira] [Commented] (HBASE-6145) Fix site target post modularization

2012-06-02 Thread Jesse Yates (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-6145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13287995#comment-13287995
 ] 

Jesse Yates commented on HBASE-6145:


A pretty in depth analysis, hopefully not too much:
Patch didn't apply cleanly with git, but I got it to go with the basic patch command 
(mostly - looks like you might be a little behind in the docbkx?).
I'm just going to step through per pom…

hbase-server/pom.xml
{quote}
+  <version>${avro.version}</version>
{quote}

This (and the rest of the dependency info) should be in the parent pom's 
dependencyManagement section. Just as general style, if another module wants to 
use avro, they should just declare the dependency, and not worry about making 
sure they have the right version, excludes, etc (I know we are dropping avro 
soon, but we should still do the right thing). 

Also, looks like your spacing is off for the added dependencies - 2 spaces, not 
tabs in xml.

hbase-common/pom.xml
{quote}
+<plugins>
+  <plugin>
+    <artifactId>maven-surefire-plugin</artifactId>
{quote}

This can/should stay in the pluginManagement section. Surefire is part of the 
default maven things to run, so it will just pick up the 
configuration/executions from the management section - this also keeps all the 
surefire stuff in the same sections in all poms.

Also, does anything actually happen if we remove:
{quote}
+  <plugin>
+    <groupId>org.apache.maven.plugins</groupId>
+    <artifactId>maven-site-plugin</artifactId>
{quote}

from this pom? It may create the target/site (side effect of site being a core 
part of maven, so all modules can respond to it), but is any actual work done?

Okay, onto the parent pom (/hbase/pom.xml):

Why the removal of the hbase-assembly module? In the official docs, it actually 
says to use an assembly module. This is particularly pertinent because the 
alternative is to use the assembly:assembly descriptor, which is deprecated… 
The docs say that you can use assembly:assembly from within the parent pom (but 
don't say what that actually means) - any way to tie the assembly:single 
phase to call the assembly:single/:assembly phase in the children poms?

In src/assembly/all.xml, this comment is no longer applicable. 
{quote}
<!-- This is only necessary until maven fixes the intra-project dependency bug
  in maven 3.0. Until then, we have to include the test jars for sub-projects. When
  fixed, the below dependencySet stuff is sufficient for pulling in the test jars as
  well, as long as they are added as dependencies in this project. Right now, we only
  have 1 submodule to accumulate, but we can copy/paste as necessary until maven is
  fixed. -->
{quote}

Also, it would be awesome to move file set matching to a more general regex, 
rather than tying it to the maven property (which is defined in the main pom). 

General nit: I prefer having the properties above the build, since the 
properties are used in the build section, but that's just style.

{quote}
<!--Pass -DskipJavadoc=true on command-line to skip javadoc building-->
{quote}
when building the site? Also, can't you just pass in -DskipJavadoc?

{quote}
<version>${maven.assembly.version}</version>
{quote}
and 
{quote}
<version>${maven.site.version}</version>
{quote}
Would be nice to add:
{code}
<!--$NO-MVN-MAN-VER$-->
{code}
to the end of the lines to remove the eclipse warning

{quote}
<plugin>
  <groupId>org.codehaus.mojo</groupId>
  <artifactId>xml-maven-plugin</artifactId>
{quote}
configuration formatting is off.

{quote}
    <execution>
      <id>copy-docbkx</id>
      <goals>
        <goal>copy-resources</goal>
      </goals>
      <phase>pre-site</phase>
      <configuration>
        <outputDirectory>target/site</outputDirectory>
        <resources>
          <resource>
            <directory>${basedir}/target/docbkx</directory>
            <includes>
              <include>**/**</include>
            </includes>
          </resource>
        </resources>
      </configuration>
    </execution>
  </executions>
  <configuration>
    <escapeString>\</escapeString>
  </configuration>
{quote}
Is the escape string new here? What property are you avoiding overriding? Also, 
why are you moving them in the first place? You can just set the target 
directory to be ${basedir}/target/docbkx in the docbkx plugin:
{code}
  <groupId>com.agilejava.docbkx</groupId>
  <artifactId>docbkx-maven-plugin</artifactId>
{code}

Here, you can also move the common traits to the 'top level' configuration for 
the plugin, then just put the differences in each execution. For example:
{code}
<plugin>
  <groupId>com.agilejava.docbkx</groupId>
  <artifactId>docbkx-maven-plugin</artifactId>
  <version>2.0.14</version>

[jira] [Commented] (HBASE-6067) HBase won't start when hbase.rootdir uses ViewFileSystem


[ 
https://issues.apache.org/jira/browse/HBASE-6067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13288017#comment-13288017
 ] 

Zhihong Yu commented on HBASE-6067:
---

@Eli:
Do you think the patch is Okay.


[jira] [Comment Edited] (HBASE-6067) HBase won't start when hbase.rootdir uses ViewFileSystem


[ 
https://issues.apache.org/jira/browse/HBASE-6067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13288017#comment-13288017
 ] 

Zhihong Yu edited comment on HBASE-6067 at 6/2/12 10:11 PM:


@Eli:
Do you think the patch is Okay ?

  was (Author: zhi...@ebaysf.com):
@Eli:
Do you think the patch is Okay.
  

[jira] [Commented] (HBASE-6145) Fix site target post modularization

2012-06-02 Thread stack (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-6145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13288037#comment-13288037
]

stack commented on HBASE-6145:
--

On avro, it's deprecated. I moved it out of top-level into the module where it's
used, a) to encourage its deprecation, and b) to get rid of some of the
noise avro was generating each time mvn dipped into a module.

How did you get that Unknown macro message? I don't see it when I run local?
W/ the -X flag? I see a version when avro does its stuff.

I moved avro back up to top-level, at least the pluginManagement section for
now.

I did pretty print on all the poms to fix indent issues: xmllint --format.

On hbase-common/pom.xml and surefire '...can/should stay in the
pluginManagement section', I don't think our modules should have a
pluginManagement section. It makes sense in top level or in a module IF this
module had submodules but otherwise a pluginManagement doesn't make sense (as I
understand it). I tried removing them from modules.

On site goal and the following '...but is any actual work done?'

There is. A site dir is made w/ css and images in it. I just checked. Seems
like the site dir is made anyways, in spite of these flags saying don't
generate a site. I just removed them.

On hbase/pom.xml, assembly:assembly is not deprecated. A distinction is made
between assembly:single and assembly:assembly. The former is for attaching to
the packaging or pre-package phase somewhere. The latter requires explicit
invocation on the command-line. In my comment above at 01/Jun/12 23:24, I talk
of how I looked at taking both routes and decided against the direction the
sonatype manual was encouraging because a) it's an ugly hack (even the manual
admits as much) and b) their technique attaches itself to the package phase, which means
a user who wants to do a basic jar build has to wait on maven copying around
fat dependencies and gzipping up packages, all the while spewing to their console,
though all they want to do is check that their jar builds. A third reason to avoid
hbase-assembly is that hbase-assembly would force an hbase-site too since
hbase-assembly would want to depend on hbase-site if the tarball was to include
documentation (the javadoc and jxr aggregations work fine up in parent, wasn't
sure how well they'd work in a submodule and it seemed wrong doing aggregations
in a submodule anyways). I think it better that we require you explicitly ask
for tarball packaging by adding the assembly:assembly to your command line (I
was afraid it would not work, that the dependency facility figuring which jars
to include would be broken but it seems fine).

Regarding 'In src/assembly/all.xml, this comment is no longer applicable.' I
think it is. At least, w/o that section, I can't get the test jar to build in
(which is why you added that comment in the first place I guess?).

bq. Also, it would be awesome to move file set matching to a more general
regex, rather than tying it to the maven property (which is defined in the main
pom).

I suppose. I don't trust mvn to do anything right. Therefore my tendency is
to keep it dumb.

bq. General nit: I prefer having the properties above the build, since the
properties are used in the build section, but that's just style.

Yeah. I kept looking for them above ... but this is not my change. This is
how it was. We can change it in another issue?

bq. ...when building the site? Also, can't you just pass in -DskipJavadoc?

You mean instead of -DskipJavadoc=true?

I just tried it, and yes, seems to work. Let me change the comment.

Again, how do you get those Unknown macro outputs? I tried w/ -X and it
doesn't show.

I don't know what NO-MVN-MAN-VER does (google'd it but no explanation after
clicking ~10 links).

Escape string is not new. Copied from what currently exists.

On docbkx, they are built into target/docbkx, and then on site build, copied
under site dir. Similar to javadoc. Keeping it a little independent of site
in case someone is working w/ docbkx only, and not interested in site (This is
how it used to work).

For docbkx configuration, this is how it was. I tried your suggestion of
moving common config above the executions and that seems to work. Good. Fixed.

bq. With the aggregate goal, you can just have all the javadocs copied into the
right directory in this module when you build them; at least that is what it
was doing before.

This is a change you made, that javadoc was aggregated into site/apidocs.
Previous to this, pre-modularization, javadocs were made into target/apidocs
and then copied into site when we ran site goal. We could go that way but
seemed strange building into site if not interested in 'site': i.e. you are
just making javadocs to check them out, etc.

The xmllint --format should take care of indents and tabs.

On fixing eclipse, can we do that in another

[jira] [Updated] (HBASE-6145) Fix site target post modularization

2012-06-02 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-6145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-6145:
-

Attachment: 6145v4.txt

Address Jesse comments.

 Fix site target post modularization
 ---

 Key: HBASE-6145
 URL: https://issues.apache.org/jira/browse/HBASE-6145
 Project: HBase
  Issue Type: Task
Reporter: stack
Assignee: stack
 Attachments: 6145v4.txt, site.txt, site2.txt, sitev3.txt





[jira] [Commented] (HBASE-6145) Fix site target post modularization

2012-06-02 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-6145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13288042#comment-13288042
 ] 

stack commented on HBASE-6145:
--

I made HBASE-6154 to address stuff not included in this patch.

Jesse, I won't commit, not till I get your blessing (you might have problem w/ 
some of my response above)


[jira] [Commented] (HBASE-6067) HBase won't start when hbase.rootdir uses ViewFileSystem

2012-06-02 Thread Eli Collins (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-6067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13288045#comment-13288045
 ] 

Eli Collins commented on HBASE-6067:


Zhihong,
Approach seems reasonable to me. I'd make it consistent with 
getNumCurrentReplicas and use a Method member. Also, think you're missing a 
call to setAccessible?
Worth checking out why there's a findbugs warning as well.

HBase gang - can someone make Zhihong a contributor and assign this to him?


[jira] [Created] (HBASE-6155) [copytable] Unexpected behavior if --starttime is not specified but --endtime is.

2012-06-02 Thread Jonathan Hsieh (JIRA)

Jonathan Hsieh created HBASE-6155:
-

 Summary: [copytable] Unexpected behavior if --starttime is not 
specified but --endtime is.
 Key: HBASE-6155
 URL: https://issues.apache.org/jira/browse/HBASE-6155
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.94.0, 0.92.1, 0.90.6, 0.96.0
Reporter: Jonathan Hsieh


If one uses copytable and specifies only an endtime, I'd expect it to include all 
rows from the unix epoch up to the specified endtime.  Instead, it copies all 
the rows.  

The workaround for copies with this kind of range is to specify --starttime=1 
(note: not --starttime=0), which is also unintuitive.
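
The report does not say why the endtime is ignored; the workaround suggests that a start time of 0 is being treated as "not set". A minimal sketch of the expected behavior, assuming unset options are represented as null rather than 0, might look like this. TimeRangeArgs, resolve and LATEST are hypothetical names and are not taken from the CopyTable source.

```java
/** Hypothetical helper showing the expected defaulting of a partial time range. */
final class TimeRangeArgs {
    /** Upper bound used when no end time is given. */
    static final long LATEST = Long.MAX_VALUE;

    private TimeRangeArgs() {}

    /**
     * Returns {min, max} for the scan, or null when neither bound was given.
     * A null argument means "option not specified", so an explicit 0 stays valid.
     */
    static long[] resolve(Long startTime, Long endTime) {
        if (startTime == null && endTime == null) {
            return null; // no time filter at all
        }
        long min = (startTime == null) ? 0L : startTime;     // default: unix epoch
        long max = (endTime == null) ? LATEST : endTime;     // default: no upper bound
        if (min > max) {
            throw new IllegalArgumentException("starttime " + min + " is after endtime " + max);
        }
        return new long[] { min, max };
    }
}
```

Under this sketch, resolve(null, 1000L) and resolve(0L, 1000L) give the same range, so neither --starttime=0 nor an omitted --starttime would need a special workaround.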


[jira] [Commented] (HBASE-6145) Fix site target post modularization

2012-06-02 Thread Hadoop QA (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-6145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13288050#comment-13288050
]

Hadoop QA commented on HBASE-6145:
--

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12530664/6145v4.txt
against trunk revision .

-1 @author. The patch appears to contain 3 @author tags which the Hadoop
community has agreed to not allow in code contributions.

+1 tests included. The patch appears to include 16 new or modified tests.

+1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

+1 javadoc. The javadoc tool did not generate any warning messages.

+1 javac. The applied patch does not increase the total number of javac
compiler warnings.

-1 findbugs. The patch appears to cause Findbugs (version 1.3.9) to fail.

+1 release audit. The applied patch does not increase the total number of
release audit warnings.

-1 core tests. The patch failed these unit tests:
org.apache.hadoop.hbase.client.TestFromClientSide
org.apache.hadoop.hbase.io.hfile.TestForceCacheImportantBlocks

org.apache.hadoop.hbase.security.access.TestZKPermissionsWatcher

Test results:
https://builds.apache.org/job/PreCommit-HBASE-Build/2089//testReport/
Console output:
https://builds.apache.org/job/PreCommit-HBASE-Build/2089//console

This message is automatically generated.


[jira] [Commented] (HBASE-6067) HBase won't start when hbase.rootdir uses ViewFileSystem


[ 
https://issues.apache.org/jira/browse/HBASE-6067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13288057#comment-13288057
 ] 

Zhihong Yu commented on HBASE-6067:
---

fs.getDefaultBlockSize() is only called in one place:
{code}
this.blocksize = conf.getLong("hbase.regionserver.hlog.blocksize",
    getDefaultBlockSize());
{code}
So I didn't use a Method member.
I will upload a new patch with setAccessible() call.


[jira] [Assigned] (HBASE-6067) HBase won't start when hbase.rootdir uses ViewFileSystem


 [ 
https://issues.apache.org/jira/browse/HBASE-6067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu reassigned HBASE-6067:
-

Assignee: Zhihong Yu  (was: Eli Collins)

 HBase won't start when hbase.rootdir uses ViewFileSystem
 

 Key: HBASE-6067
 URL: https://issues.apache.org/jira/browse/HBASE-6067
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: Eli Collins
Assignee: Zhihong Yu
 Attachments: 6067.txt


 HBase currently doesn't work with HDFS federation (hbase.rootdir with a 
 client that uses viewfs) because HLog#init uses 
 FileSystem#getDefaultBlockSize and getDefaultReplication. These throw an 
 exception because there is no default filesystem in a viewfs client so 
 there's no way to determine a default block size or replication factor. They 
 could use the versions of these methods that take a path, however these were 
 introduced in HADOOP-8014 and are not yet available in Hadoop 1.x.


[jira] [Updated] (HBASE-6067) HBase won't start when hbase.rootdir uses ViewFileSystem


 [ 
https://issues.apache.org/jira/browse/HBASE-6067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-6067:
--

Attachment: 6067-v2.txt

Added setAccessible() call.

 HBase won't start when hbase.rootdir uses ViewFileSystem
 

 Key: HBASE-6067
 URL: https://issues.apache.org/jira/browse/HBASE-6067
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Reporter: Eli Collins
Assignee: Zhihong Yu
 Attachments: 6067-v2.txt, 6067.txt


 HBase currently doesn't work with HDFS federation (hbase.rootdir with a 
 client that uses viewfs) because HLog#init uses 
 FileSystem#getDefaultBlockSize and getDefaultReplication. These throw an 
 exception because there is no default filesystem in a viewfs client so 
 there's no way to determine a default block size or replication factor. They 
 could use the versions of these methods that take a path, however these were 
 introduced in HADOOP-8014 and are not yet available in Hadoop 1.x.


[jira] [Commented] (HBASE-6067) HBase won't start when hbase.rootdir uses ViewFileSystem