[jira] [Created] (HBASE-10627) A logic mistake in HRegionServer isHealthy

2014-02-27 Thread Liu Shaohui (JIRA)
Liu Shaohui created HBASE-10627:
---

 Summary: A logic mistake in HRegionServer isHealthy
 Key: HBASE-10627
 URL: https://issues.apache.org/jira/browse/HBASE-10627
 Project: HBase
  Issue Type: Bug
Reporter: Liu Shaohui
Priority: Minor


After reading isHealthy() in HRegionServer, I think there is a logic mistake.
{code}
// Verify that all threads are alive
if (!(leases.isAlive()
&& cacheFlusher.isAlive() && hlogRoller.isAlive()
&& this.compactionChecker.isAlive())   // <-- logic wrong here: the negation closes too early
&& this.periodicFlusher.isAlive()) {
  stop("One or more threads are no longer alive -- stop");
  return false;
}
{code}

which should be
{code}
// Verify that all threads are alive
if (!(leases.isAlive()
&& cacheFlusher.isAlive() && hlogRoller.isAlive()
&& this.compactionChecker.isAlive()
&& this.periodicFlusher.isAlive())) {
  stop("One or more threads are no longer alive -- stop");
  return false;
}
{code}

Please point out if I am wrong. Thanks.
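
The difference between the two conditions can be reproduced outside HBase. The sketch below is a hypothetical stand-in (the class and method names are not from HRegionServer) that uses plain booleans in place of the isAlive() calls. It only illustrates operator precedence in the two forms of the condition.

```java
// Hypothetical stand-in for the isHealthy() condition; not HBase code.
class HealthCheckPrecedence {

  // Condition as currently written: the negation covers only the first four checks.
  static boolean currentTriggersStop(boolean leases, boolean cacheFlusher,
      boolean hlogRoller, boolean compactionChecker, boolean periodicFlusher) {
    return !(leases && cacheFlusher && hlogRoller && compactionChecker)
        && periodicFlusher;
  }

  // Proposed condition: the negation covers all five checks.
  static boolean proposedTriggersStop(boolean leases, boolean cacheFlusher,
      boolean hlogRoller, boolean compactionChecker, boolean periodicFlusher) {
    return !(leases && cacheFlusher && hlogRoller && compactionChecker
        && periodicFlusher);
  }

  public static void main(String[] args) {
    // Only periodicFlusher is dead: the current form does not stop the server.
    System.out.println(currentTriggersStop(true, true, true, true, false));   // false
    System.out.println(proposedTriggersStop(true, true, true, true, false));  // true

    // compactionChecker and periodicFlusher are both dead: still missed.
    System.out.println(currentTriggersStop(true, true, true, false, false));  // false
    System.out.println(proposedTriggersStop(true, true, true, false, false)); // true

    // Everything alive: both forms agree that nothing needs to stop.
    System.out.println(currentTriggersStop(true, true, true, true, true));    // false
    System.out.println(proposedTriggersStop(true, true, true, true, true));   // false
  }
}
```

With the current form, any failure goes unnoticed whenever periodicFlusher is itself dead, because the trailing `&& periodicFlusher` makes the whole condition false.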




--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Created] (HBASE-10628) Fix semantic inconsistency among methods which are exposed to client

2014-02-27 Thread Feng Honghua (JIRA)
Feng Honghua created HBASE-10628:


 Summary: Fix semantic inconsistency among methods which are 
exposed to client
 Key: HBASE-10628
 URL: https://issues.apache.org/jira/browse/HBASE-10628
 Project: HBase
  Issue Type: Bug
  Components: Client, master
Reporter: Feng Honghua
Assignee: Feng Honghua


This serves as a placeholder JIRA for the semantic inconsistencies among client 
methods such as listTables / tableExists / getTableDescriptor described in 
HBASE-10584 and HBASE-10595, and for other related semantic fixes.





Re: Order of the fields in REST JSon calls?

2014-02-27 Thread Steve Loughran
JSON libs are pretty inconsistent about some things, like what to do if
there is >1 instance of the same field (some pick one, others create an
array). Anything that uses a hashmap internally may pick an ordering based
on the hash keys, while other libs do things that make no sense whatsoever:
http://steveloughran.blogspot.co.uk/2012/02/just-because-you-can-rewrite-your.html

I don't think there is a good solution here except to avoid some trouble spots
(those duplicate entries), warn about orderings, and maybe even have tests for
that.
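
To illustrate the ordering point without tying it to any particular JSON library, here is a minimal sketch using only java.util maps. The field names are made up for the example. It shows that the order in which fields come back depends on the backing map: HashMap makes no ordering promise, LinkedHashMap keeps insertion order, and TreeMap sorts by key.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustration only: field names are hypothetical, no JSON library involved.
class FieldOrderDemo {

  // Inserts the same three fields into the given map and returns the
  // key order that the map reports back.
  static List<String> keysAfterInsert(Map<String, Object> fields) {
    fields.put("name", "node1");
    fields.put("startCode", 42L);
    fields.put("heapSizeMB", 512);
    return new ArrayList<>(fields.keySet());
  }

  public static void main(String[] args) {
    // Unspecified order: may differ between JVMs and library versions.
    System.out.println("HashMap:       " + keysAfterInsert(new HashMap<>()));
    // Insertion order is preserved.
    System.out.println("LinkedHashMap: " + keysAfterInsert(new LinkedHashMap<>()));
    // Keys are sorted.
    System.out.println("TreeMap:       " + keysAfterInsert(new TreeMap<>()));
  }
}
```

A test that asserts on a serialized string is therefore only safe when the producing side uses an order-preserving or sorted structure; otherwise it is safer to compare parsed fields.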


On 26 February 2014 19:09, Jean-Marc Spaggiari wrote:

> Hum. I see
>
>
> https://issues.apache.org/jira/browse/HBASE-9435?focusedCommentId=13782477&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13782477
>
> I will take a quick look to see if we have very simple option to bypass
> current issue with 0.94 without having to make this compatible. Else, will
> just indicate this link to 10617.
>
>
> 2014-02-26 14:04 GMT-05:00 Andrew Purtell :
>
> > On Wed, Feb 26, 2014 at 10:55 AM, Jean-Marc Spaggiari <
> > jean-m...@spaggiari.org> wrote:
> >
> > > I'm not sure.
> > >
> > > Here is a comment from the patch: "The patch is backward compatible
> > except
> > > for StorageClusterStatusModel, which is broken anyway. It only shows
> one
> > > node in the liveNodes field." So it might be?
> > >
> > > 2014-02-26 13:48 GMT-05:00 Ted Yu :
> > >
> > > > The API changes from HBASE-9435 are incompatible changes, right ?
> > >
> >
> > Yes
> >
> >
> > --
> > Best regards,
> >
> >- Andy
> >
> > Problems worthy of attack prove their worth by hitting back. - Piet Hein
> > (via Tom White)
> >
>



[jira] [Created] (HBASE-10629) Fix incorrect handling of IE that restores current thread's interrupt status within while/for loops

2014-02-27 Thread Feng Honghua (JIRA)
Feng Honghua created HBASE-10629:


 Summary: Fix incorrect handling of IE that restores current 
thread's interrupt status within while/for loops
 Key: HBASE-10629
 URL: https://issues.apache.org/jira/browse/HBASE-10629
 Project: HBase
  Issue Type: Bug
  Components: Client, master, regionserver, Replication
Reporter: Feng Honghua
Assignee: Feng Honghua


There are three typical kinds of incorrect handling of InterruptedException 
(IE) thrown during sleep() in the current code base:
# Swallowing it completely -- fixed by HBASE-10497
# Restoring the current thread's interrupt status implicitly within while/for 
loops (Threads.sleep() being called within while/for loops) -- fixed by 
HBASE-10516
# Restoring the current thread's interrupt status explicitly within while/for 
loops (directly interrupting the current thread within while/for loops)

There are still places with the last kind of handling error. As with 
HBASE-10497/HBASE-10516, these should be fixed case by case according to 
their real scenarios. This issue serves as a parent JIRA for fixing the last 
kind of error in a systematic manner.
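
The third kind can be sketched as follows. This is a hypothetical, self-contained example (the class and method names are not from HBase). It assumes a retry loop that sleeps between attempts. When the interrupt status is restored inside the loop and the loop keeps going, every later sleep() throws immediately, so the loop spins without pausing. Leaving the loop after restoring the status is one possible fix. As the issue says, the right fix depends on the scenario.

```java
// Hypothetical illustration of restoring the interrupt status inside a loop.
class InterruptInLoop {

  // Problematic pattern: the status is restored but the loop continues, so each
  // following sleep() throws InterruptedException at once.
  // Returns how many sleeps were cut short by an interrupt.
  static int retryRestoringInsideLoop(int maxAttempts) {
    int interruptedSleeps = 0;
    for (int attempt = 0; attempt < maxAttempts; attempt++) {
      try {
        Thread.sleep(10);
      } catch (InterruptedException e) {
        interruptedSleeps++;
        Thread.currentThread().interrupt(); // restored, but the loop goes on
      }
    }
    return interruptedSleeps;
  }

  // One possible fix: restore the status and leave the loop.
  // Returns how many attempts were started.
  static int retryStoppingOnInterrupt(int maxAttempts) {
    int attemptsStarted = 0;
    for (int attempt = 0; attempt < maxAttempts; attempt++) {
      attemptsStarted++;
      try {
        Thread.sleep(10);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // let the caller see the interrupt
        break;
      }
    }
    return attemptsStarted;
  }

  public static void main(String[] args) {
    Thread.currentThread().interrupt();
    System.out.println(retryRestoringInsideLoop(5)); // 5: every sleep was interrupted
    Thread.interrupted(); // clear the status before the second run

    Thread.currentThread().interrupt();
    System.out.println(retryStoppingOnInterrupt(5)); // 1: the loop exits on the first interrupt
    Thread.interrupted();
  }
}
```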





[jira] [Resolved] (HBASE-9469) Synchronous replication

2014-02-27 Thread Feng Honghua (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-9469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feng Honghua resolved HBASE-9469.
-

Resolution: Won't Fix

It has less value than expected, as described in the last comment.

> Synchronous replication
> ---
>
> Key: HBASE-9469
> URL: https://issues.apache.org/jira/browse/HBASE-9469
> Project: HBase
>  Issue Type: New Feature
>Reporter: Feng Honghua
>
> Scenario: 
> A/B clusters with master-master replication, client writes to A cluster and A 
> pushes all writes to B cluster, and when A cluster is down, client switches 
> writing to B cluster.
> But the client's write switch is unsafe due to the replication between A/B is 
> asynchronous: a delete to B cluster which aims to delete a put written 
> earlier can fail due to that put is written to A cluster and isn't 
> successfully pushed to B before A is down. It can be worse if this delete is 
> collected(flush and then major compact occurs) before A cluster is up and 
> that put is eventually pushed to B, the put won't ever be deleted.
> Can we provide per-table/per-peer synchronous replication which ships the 
> according hlog entry of write before responsing write success to client? By 
> this we can guarantee the client that all write requests for which he got 
> success response when he wrote to A cluster must already have been in B 
> cluster as well.





[jira] [Created] (HBASE-10630) NullPointerException in ConnectionManager$HConnectionImplementation.locateRegionInMeta() due to missing region info

2014-02-27 Thread Ted Yu (JIRA)
Ted Yu created HBASE-10630:
--

 Summary: NullPointerException in 
ConnectionManager$HConnectionImplementation.locateRegionInMeta() due to missing 
region info
 Key: HBASE-10630
 URL: https://issues.apache.org/jira/browse/HBASE-10630
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.0
Reporter: Ted Yu
Assignee: Ted Yu


During Load And Verify With Chaos Monkey test, we observed:
{code}
2014-02-26 16:28:17,964|beaver.machine|INFO|2014-02-26 16:28:17,964 INFO  
[main] mapreduce.Job:  map 71% reduce 0%
2014-02-26 16:28:20,073|beaver.machine|INFO|2014-02-26 16:28:20,073 INFO  
[main] mapreduce.Job:  map 82% reduce 0%
2014-02-26 16:28:20,077|beaver.machine|INFO|2014-02-26 16:28:20,077 INFO  
[main] mapreduce.Job: Task Id : attempt_1393409213482_0015_m_68_0, Status : 
FAILED
2014-02-26 16:28:20,099|beaver.machine|INFO|Error: 
java.lang.NullPointerException
2014-02-26 16:28:20,100|beaver.machine|INFO|at 
org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegionInMeta(ConnectionManager.java:1175)
2014-02-26 16:28:20,100|beaver.machine|INFO|at 
org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1038)
2014-02-26 16:28:20,100|beaver.machine|INFO|at 
org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegionAll(ConnectionManager.java:986)
2014-02-26 16:28:20,101|beaver.machine|INFO|at 
org.apache.hadoop.hbase.client.AsyncProcess.findDestLocation(AsyncProcess.java:418)
2014-02-26 16:28:20,101|beaver.machine|INFO|at 
org.apache.hadoop.hbase.client.AsyncProcess.submit(AsyncProcess.java:343)
2014-02-26 16:28:20,101|beaver.machine|INFO|at 
org.apache.hadoop.hbase.client.AsyncProcess.submit(AsyncProcess.java:296)
2014-02-26 16:28:20,102|beaver.machine|INFO|at 
org.apache.hadoop.hbase.client.HTable.backgroundFlushCommits(HTable.java:1024)
2014-02-26 16:28:20,102|beaver.machine|INFO|at 
org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:1298)
2014-02-26 16:28:20,102|beaver.machine|INFO|at 
org.apache.hadoop.hbase.test.IntegrationTestLoadAndVerify$LoadMapper.cleanup(IntegrationTestLoadAndVerify.java:188)
2014-02-26 16:28:20,102|beaver.machine|INFO|at 
org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:148)
2014-02-26 16:28:20,103|beaver.machine|INFO|at 
org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
2014-02-26 16:28:20,103|beaver.machine|INFO|at 
org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
2014-02-26 16:28:20,103|beaver.machine|INFO|at 
org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
2014-02-26 16:28:20,103|beaver.machine|INFO|at 
java.security.AccessController.doPrivileged(Native Method)
2014-02-26 16:28:20,104|beaver.machine|INFO|at 
javax.security.auth.Subject.doAs(Subject.java:396)
2014-02-26 16:28:20,104|beaver.machine|INFO|at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
2014-02-26 16:28:20,104|beaver.machine|INFO|at 
org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
2014-02-26 16:28:20,105|beaver.machine|INFO|
2014-02-26 16:28:20,105|beaver.machine|INFO|Container killed by the 
ApplicationMaster.
{code}
Here is related code:
{code}
   // convert the row result into the HRegionLocation we need!
   location = MetaReader.getRegionLocations(regionInfoRow);
   HRegionInfo regionInfo = location.getRegionLocation().getRegionInfo();
   if (regionInfo == null) {
     throw new IOException("HRegionInfo was null or empty in " +
{code}
A null check should be performed against both location and 
location.getRegionLocation() before getRegionInfo() is called.
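
A minimal sketch of the suggested check, under loud assumptions: RegionInfo, RegionLocation and RegionLocations below are hypothetical stand-ins for the HBase types in the snippet, and the exception messages are invented for the example. It is not the committed patch.

```java
import java.io.IOException;

// Hypothetical stand-ins for the HBase types; only the null-check order matters.
class RegionLocationNullCheck {

  static class RegionInfo { }

  static class RegionLocation {
    private final RegionInfo info;
    RegionLocation(RegionInfo info) { this.info = info; }
    RegionInfo getRegionInfo() { return info; }
  }

  static class RegionLocations {
    private final RegionLocation first;
    RegionLocations(RegionLocation first) { this.first = first; }
    RegionLocation getRegionLocation() { return first; }
  }

  // Checks each link of the chain before dereferencing it, so a missing
  // location surfaces as an IOException instead of a NullPointerException.
  static RegionInfo regionInfoOrThrow(RegionLocations location) throws IOException {
    if (location == null || location.getRegionLocation() == null) {
      throw new IOException("No region location found in the meta row");
    }
    RegionInfo regionInfo = location.getRegionLocation().getRegionInfo();
    if (regionInfo == null) {
      throw new IOException("Region info was null or empty in the meta row");
    }
    return regionInfo;
  }

  public static void main(String[] args) throws IOException {
    RegionInfo info = new RegionInfo();
    RegionLocations ok = new RegionLocations(new RegionLocation(info));
    System.out.println(regionInfoOrThrow(ok) == info); // true

    try {
      regionInfoOrThrow(new RegionLocations(null));
    } catch (IOException e) {
      System.out.println("caught: " + e.getMessage());
    }
  }
}
```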





Re: Why doesn't KeyValue.equals/CellComparator compare the values?

2014-02-27 Thread Stack
On Wed, Feb 26, 2014 at 8:31 PM, Matt Corgan  wrote:


> But maybe one of the committers could add a sentence to emphasize that
> value is excluded.
>
>
We should underline that the value is not considered when comparing Cells
(KeyValues).  Apart from the fact that it could make for some interesting
performance issues, the system isn't plumbed for dealing with Cells whose
coordinates are the same and that differ in their value only.  (Rather, the
mvcc/sequenceid is used for distinguishing Cells whose coordinates are
otherwise the same.)
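
As a rough illustration of value being excluded, here is a deliberately simplified sketch. SimpleCell is a hypothetical class, not KeyValue or CellComparator. It keeps only row, family, qualifier and timestamp as coordinates and leaves out the type and mvcc/sequenceid that the real classes also deal with.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical simplified cell: equality is defined on coordinates only.
class SimpleCell {
  final String row;
  final String family;
  final String qualifier;
  final long timestamp;
  final byte[] value; // deliberately ignored by equals() and hashCode()

  SimpleCell(String row, String family, String qualifier, long timestamp, byte[] value) {
    this.row = row;
    this.family = family;
    this.qualifier = qualifier;
    this.timestamp = timestamp;
    this.value = value;
  }

  @Override
  public boolean equals(Object o) {
    if (this == o) return true;
    if (!(o instanceof SimpleCell)) return false;
    SimpleCell other = (SimpleCell) o;
    return row.equals(other.row)
        && family.equals(other.family)
        && qualifier.equals(other.qualifier)
        && timestamp == other.timestamp;
  }

  @Override
  public int hashCode() {
    return Objects.hash(row, family, qualifier, timestamp);
  }

  public static void main(String[] args) {
    SimpleCell a = new SimpleCell("r1", "f", "q", 100L, new byte[] {1});
    SimpleCell b = new SimpleCell("r1", "f", "q", 100L, new byte[] {2});
    System.out.println(a.equals(b)); // true: same coordinates, different value

    Set<SimpleCell> cells = new HashSet<>();
    cells.add(a);
    cells.add(b);
    System.out.println(cells.size()); // 1: the second cell counts as a duplicate
  }
}
```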

What was your expectation, mighty Cosmin?  What do you think HBase should do
with Cells that differ in value only?

Thanks,
St.Ack


[jira] [Resolved] (HBASE-10630) NullPointerException in ConnectionManager$HConnectionImplementation.locateRegionInMeta() due to missing region info

2014-02-27 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu resolved HBASE-10630.


  Resolution: Fixed
Hadoop Flags: Reviewed

Integrated to 10070 branch.

> NullPointerException in 
> ConnectionManager$HConnectionImplementation.locateRegionInMeta() due to 
> missing region info
> ---
>
> Key: HBASE-10630
> URL: https://issues.apache.org/jira/browse/HBASE-10630
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Ted Yu
>Assignee: Ted Yu
> Attachments: 10630-v1.txt
>
>





[jira] [Created] (HBASE-10631) Avoid extra seek on FileLink open

2014-02-27 Thread Matteo Bertozzi (JIRA)
Matteo Bertozzi created HBASE-10631:
---

 Summary: Avoid extra seek on FileLink open
 Key: HBASE-10631
 URL: https://issues.apache.org/jira/browse/HBASE-10631
 Project: HBase
  Issue Type: Bug
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Minor
 Attachments: HBASE-10631-v0.patch

There is an extra seek(0) on FileLink open that we can skip.





[jira] [Created] (HBASE-10632) Region lost in limbo after ArrayIndexOutOfBoundsException during assignment

2014-02-27 Thread Nick Dimiduk (JIRA)
Nick Dimiduk created HBASE-10632:


 Summary: Region lost in limbo after ArrayIndexOutOfBoundsException 
during assignment
 Key: HBASE-10632
 URL: https://issues.apache.org/jira/browse/HBASE-10632
 Project: HBase
  Issue Type: Bug
  Components: Region Assignment
Affects Versions: hbase-10070
Reporter: Nick Dimiduk
Assignee: Enis Soztutar
 Fix For: 0.99.0, hbase-10070


Discovered while running IntegrationTestBigLinkedList. Region 
24d68aa7239824e42390a77b7212fcbf is scheduled for move from hor13n19 to 
hor13n13. During the process an exception is thrown.

{noformat}
2014-02-25 15:30:42,613 INFO  [MASTER_SERVER_OPERATIONS-hor13n12:6-4] 
master.RegionStates: Transitioning {24d68aa7239824e42390a77b7212fcbf 
state=OPENING, ts=1393342207107, 
server=hor13n19.gq1.ygridcore.net,60020,1393341563552} will be handled by SSH 
for hor13n19.gq1.ygridcore.net,60020,1393341563552
2014-02-25 15:30:42,613 INFO  [MASTER_SERVER_OPERATIONS-hor13n12:6-4] 
handler.ServerShutdownHandler: Reassigning 7 region(s) that 
hor13n19.gq1.ygridcore.net,60020,1393341563552 was carrying (and 0 regions(s) 
that were opening on this server)
2014-02-25 15:30:42,613 INFO  [MASTER_SERVER_OPERATIONS-hor13n12:6-4] 
handler.ServerShutdownHandler: Reassigning region with rs = 
{24d68aa7239824e42390a77b7212fcbf state=OPENING, ts=1393342207107, 
server=hor13n19.gq1.ygridcore.net,60020,1393341563552} and deleting zk node if 
exists
2014-02-25 15:30:42,623 INFO  [MASTER_SERVER_OPERATIONS-hor13n12:6-4] 
master.RegionStates: Transitioned {24d68aa7239824e42390a77b7212fcbf 
state=OPENING, ts=1393342207107, 
server=hor13n19.gq1.ygridcore.net,60020,1393341563552} to 
{24d68aa7239824e42390a77b7212fcbf state=OFFLINE, ts=1393342242623, 
server=hor13n19.gq1.ygridcore.net,60020,1393341563552}
2014-02-25 15:30:42,623 DEBUG [AM.ZK.Worker-pool2-t46] 
master.AssignmentManager: Znode 
IntegrationTestBigLinkedList,\x80\x06\x1A,1393342105093.24d68aa7239824e42390a77b7212fcbf.
 deleted, state: {24d68aa7239824e42390a77b7212fcbf state=OFFLINE, 
ts=1393342242623, server=hor13n19.gq1.ygridcore.net,60020,1393341563552}
...
2014-02-25 15:30:43,993 ERROR [MASTER_SERVER_OPERATIONS-hor13n12:6-4] 
executor.EventHandler: Caught throwable while processing event M_SERVER_SHUTDOWN
java.lang.ArrayIndexOutOfBoundsException: 0
at 
org.apache.hadoop.hbase.master.balancer.BaseLoadBalancer$Cluster.<init>(BaseLoadBalancer.java:250)
at 
org.apache.hadoop.hbase.master.balancer.BaseLoadBalancer.createCluster(BaseLoadBalancer.java:921)
at 
org.apache.hadoop.hbase.master.balancer.BaseLoadBalancer.roundRobinAssignment(BaseLoadBalancer.java:860)
at 
org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:2482)
at 
org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:282)
at 
org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:722)
{noformat}

After that, the region is left in limbo and is never reassigned.

{noformat}
2014-02-25 15:35:11,581 INFO  [FifoRpcScheduler.handler1-thread-6] 
master.HMaster: Client=hrt_qa//68.142.246.29 move 
hri=IntegrationTestBigLinkedList,\x80\x06\x1A,1393342105093.24d68aa7239824e42390a77b7212fcbf.,
 src=hor13n19.gq1.ygridcore.net,60020,1393341563552, 
dest=hor13n13.gq1.ygridcore.net,60020,139334275, running balancer
2014-02-25 15:35:11,581 INFO  [FifoRpcScheduler.handler1-thread-6] 
master.AssignmentManager: Ignored moving region not assigned: {ENCODED => 
24d68aa7239824e42390a77b7212fcbf, NAME => 
'IntegrationTestBigLinkedList,\x80\x06\x1A,1393342105093.24d68aa7239824e42390a77b7212fcbf.',
 STARTKEY => '\x80\x06\x1A', ENDKEY => ''}, {24d68aa7239824e42390a77b7212fcbf 
state=OFFLINE, ts=1393342242623, 
server=hor13n19.gq1.ygridcore.net,60020,1393341563552}
...
2014-02-25 15:35:26,586 DEBUG 
[hor13n12.gq1.ygridcore.net,6,1393341917402-BalancerChore] master.HMaster: 
Not running balancer because 1 region(s) in transition: 
{24d68aa7239824e42390a77b7212fcbf={24d68aa7239824e42390a77b7212fcbf 
state=OFFLINE, ts=1393342242623, 
server=hor13n19.gq1.ygridcore.net,60020,1393341563552}}
...
2014-02-25 15:35:51,945 DEBUG [FifoRpcScheduler.handler1-thread-16] 
master.HMaster: Client=hrt_qa//68.142.246.29 unassign 
IntegrationTestBigLinkedList,\x80\x06\x1A,1393342105093.24d68aa7239824e42390a77b7212fcbf.
 in current location if it is online and reassign.force=false
2014-02-25 15:35:51,945 DEBUG [FifoRpcScheduler.handler1-thread-16] 
master.AssignmentManager: Starting unassign of 
IntegrationTestBigLinkedList,\x80\x06\x1A,1393342105093.24d68aa7239824e42390a77b7212fcbf

[jira] [Created] (HBASE-10633) StoreFileRefresherChore throws ConcurrentModificationException sometimes

2014-02-27 Thread Devaraj Das (JIRA)
Devaraj Das created HBASE-10633:
---

 Summary: StoreFileRefresherChore throws 
ConcurrentModificationException sometimes
 Key: HBASE-10633
 URL: https://issues.apache.org/jira/browse/HBASE-10633
 Project: HBase
  Issue Type: Sub-task
Reporter: Devaraj Das
Assignee: Devaraj Das








[jira] [Created] (HBASE-10634) Multiget doesn't fully work

2014-02-27 Thread Devaraj Das (JIRA)
Devaraj Das created HBASE-10634:
---

 Summary: Multiget doesn't fully work
 Key: HBASE-10634
 URL: https://issues.apache.org/jira/browse/HBASE-10634
 Project: HBase
  Issue Type: Sub-task
Reporter: Devaraj Das








[jira] [Resolved] (HBASE-10633) StoreFileRefresherChore throws ConcurrentModificationException sometimes

2014-02-27 Thread Devaraj Das (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj Das resolved HBASE-10633.
-

   Resolution: Fixed
Fix Version/s: hbase-10070

Committed. Thanks for the quick review [~enis].

> StoreFileRefresherChore throws ConcurrentModificationException sometimes
> 
>
> Key: HBASE-10633
> URL: https://issues.apache.org/jira/browse/HBASE-10633
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Devaraj Das
>Assignee: Devaraj Das
> Fix For: hbase-10070
>
> Attachments: 10633-1.txt
>
>






[jira] [Created] (HBASE-10635) thrift#TestThriftServer fails due to TTL validity check

2014-02-27 Thread Ted Yu (JIRA)
Ted Yu created HBASE-10635:
--

 Summary: thrift#TestThriftServer fails due to TTL validity check
 Key: HBASE-10635
 URL: https://issues.apache.org/jira/browse/HBASE-10635
 Project: HBase
  Issue Type: Test
Reporter: Ted Yu


From 
https://builds.apache.org/job/HBase-TRUNK/4960/testReport/junit/org.apache.hadoop.hbase.thrift/TestThriftServer/testAll/ 
:
{code}
IOError(message:org.apache.hadoop.hbase.DoNotRetryIOException: TTL for column 
family columnA  must be positive. Set hbase.table.sanity.checks to false at 
conf or table descriptor if you want to bypass sanity checks
at 
org.apache.hadoop.hbase.master.HMaster.sanityCheckTableDescriptor(HMaster.java:1824)
at org.apache.hadoop.hbase.master.HMaster.createTable(HMaster.java:1750)
at org.apache.hadoop.hbase.master.HMaster.createTable(HMaster.java:1876)
at 
org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java:40470)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2016)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:98)
at 
org.apache.hadoop.hbase.ipc.FifoRpcScheduler$1.run(FifoRpcScheduler.java:73)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:724)
)
at 
org.apache.hadoop.hbase.thrift.ThriftServerRunner$HBaseHandler.createTable(ThriftServerRunner.java:971)
at 
org.apache.hadoop.hbase.thrift.TestThriftServer.createTestTables(TestThriftServer.java:224)
at 
org.apache.hadoop.hbase.thrift.TestThriftServer.doTestTableCreateDrop(TestThriftServer.java:140)
at 
org.apache.hadoop.hbase.thrift.TestThriftServer.doTestTableCreateDrop(TestThriftServer.java:136)
at 
org.apache.hadoop.hbase.thrift.TestThriftServer.testAll(TestThriftServer.java:115)
{code}
Looks like ColumnDescriptor contains TTL of -1.
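
For illustration only, a check of the kind that produces this message could look like the sketch below. It is a hypothetical stand-in, not the code in HMaster.sanityCheckTableDescriptor. The class name, method signature and the boolean flag (standing in for hbase.table.sanity.checks) are assumptions made for the example.

```java
import java.io.IOException;

// Hypothetical sketch of a TTL sanity check; not the HMaster implementation.
class TtlSanityCheck {

  // Rejects non-positive TTLs unless sanity checks are switched off.
  static void checkTtl(String family, int ttlSeconds, boolean sanityChecksEnabled)
      throws IOException {
    if (!sanityChecksEnabled) {
      return; // the caller opted out of sanity checks
    }
    if (ttlSeconds <= 0) {
      throw new IOException("TTL for column family " + family + " must be positive.");
    }
  }

  public static void main(String[] args) throws IOException {
    checkTtl("columnA", 86400, true); // accepted
    checkTtl("columnA", -1, false);   // accepted: checks are disabled
    try {
      checkTtl("columnA", -1, true);  // rejected, as in the test failure above
    } catch (IOException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

Under this reading, a descriptor that carries -1 as its TTL fails table creation as long as the checks stay enabled.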





[jira] [Created] (HBASE-10636) HBaseAdmin.deleteTable isn't 'really' synchronous in that still some cleanup in HMaster after client thinks deleteTable() succeeds

2014-02-27 Thread Feng Honghua (JIRA)
Feng Honghua created HBASE-10636:


 Summary: HBaseAdmin.deleteTable isn't 'really' synchronous in that 
still some cleanup in HMaster after client thinks deleteTable() succeeds
 Key: HBASE-10636
 URL: https://issues.apache.org/jira/browse/HBASE-10636
 Project: HBase
  Issue Type: Sub-task
  Components: Client, master
Reporter: Feng Honghua
Assignee: Feng Honghua


In HBaseAdmin.deleteTable():
{code}
  public void deleteTable(final TableName tableName) throws IOException {
    // Wait until all regions deleted
    for (int tries = 0; tries < (this.numRetries * this.retryLongerMultiplier); tries++) {
      // let us wait until hbase:meta table is updated and
      // HMaster removes the table from its HTableDescriptors
      if (values == null || values.length == 0) {
        tableExists = false;
        GetTableDescriptorsResponse htds;
        MasterKeepAliveConnection master = connection.getKeepAliveMasterService();
        try {
          GetTableDescriptorsRequest req =
            RequestConverter.buildGetTableDescriptorsRequest(tableName);
          htds = master.getTableDescriptors(null, req);
        } catch (ServiceException se) {
          throw ProtobufUtil.getRemoteException(se);
        } finally {
          master.close();
        }
        tableExists = !htds.getTableSchemaList().isEmpty();
        if (!tableExists) {
          break;
        }
      }
    }
{code}
The client considers deleteTable() successful as soon as it can no longer 
retrieve the table descriptor.

But in HMaster, the DeleteTableHandler, which actually deletes the table, does 
the following:
{code}
  protected void handleTableOperation(List<HRegionInfo> regions)
      throws IOException, KeeperException {
    // 1. Wait because of region in transition

    // 2. Remove regions from META
    LOG.debug("Deleting regions from META");
    MetaEditor.deleteRegions(this.server.getCatalogTracker(), regions);

    // 3. Move the table in /hbase/.tmp
    MasterFileSystem mfs = this.masterServices.getMasterFileSystem();
    Path tempTableDir = mfs.moveTableToTemp(tableName);

    try {
      // 4. Delete regions from FS (temp directory)
      FileSystem fs = mfs.getFileSystem();
      for (HRegionInfo hri: regions) {
        LOG.debug("Archiving region " + hri.getRegionNameAsString() + " from FS");
        HFileArchiver.archiveRegion(fs, mfs.getRootDir(),
            tempTableDir, new Path(tempTableDir, hri.getEncodedName()));
      }

      // 5. Delete table from FS (temp directory)
      if (!fs.delete(tempTableDir, true)) {
        LOG.error("Couldn't delete " + tempTableDir);
      }

      LOG.debug("Table '" + tableName + "' archived!");
    } finally {
      // 6. Update table descriptor cache
      LOG.debug("Removing '" + tableName + "' descriptor.");
      this.masterServices.getTableDescriptors().remove(tableName);

      // 7. Clean up regions of the table in RegionStates.
      LOG.debug("Removing '" + tableName + "' from region states.");
      states.tableDeleted(tableName);

      // 8. If entry for this table in zk, and up in AssignmentManager, remove it.
      LOG.debug("Marking '" + tableName + "' as deleted.");
      am.getZKTable().setDeletedTable(tableName);
    }

    if (cpHost != null) {
      cpHost.postDeleteTableHandler(this.tableName);
    }
  }
{code}
Removing regions from RegionStates, marking the table as deleted in ZK, and 
calling the coprocessor's postDeleteTableHandler all happen after the table is 
removed from the TableDescriptor cache.

So client code that relies on RegionStates/ZKTable/CP being cleaned up once 
deleteTable() returns may fail if its requests hit HMaster before those three 
cleanup steps are done...

In fact, when I add a sleep (such as 200ms) after the line below to simulate a 
possibly slow-running HMaster
{code}
this.masterServices.getTableDescriptors().remove(tableName);
{code}
some unit tests (such as moveRegion / confirming the postDeleteTable CP 
immediately after deleteTable) no longer pass.
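
The window can be reproduced in isolation with a small simulation. Everything in it is hypothetical: the class, the two flags and the latches are stand-ins for the descriptor cache, RegionStates and a slow master. The latch plays the role of the 200ms sleep so that the outcome is deterministic.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical simulation of the ordering problem; not HBase code.
class DeleteTableRace {
  final AtomicBoolean descriptorPresent = new AtomicBoolean(true);
  final AtomicBoolean regionStatesCleaned = new AtomicBoolean(false);
  final CountDownLatch descriptorRemoved = new CountDownLatch(1);
  final CountDownLatch resumeCleanup = new CountDownLatch(1);

  // "Master" side: removes the descriptor first, then finishes the cleanup.
  void runHandler() throws InterruptedException {
    descriptorPresent.set(false);
    descriptorRemoved.countDown();
    resumeCleanup.await();         // stands in for a slow-running master
    regionStatesCleaned.set(true); // remaining cleanup happens only afterwards
  }

  // "Client" side: returns whether the remaining cleanup was already done at
  // the moment the client saw the descriptor disappear.
  static boolean cleanupDoneWhenClientReturns() throws InterruptedException {
    DeleteTableRace master = new DeleteTableRace();
    Thread handler = new Thread(() -> {
      try {
        master.runHandler();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    });
    handler.start();

    master.descriptorRemoved.await(); // client: descriptor is gone, delete "succeeded"
    boolean cleaned = master.regionStatesCleaned.get();

    master.resumeCleanup.countDown(); // let the master finish
    handler.join();
    return cleaned;
  }

  public static void main(String[] args) throws InterruptedException {
    System.out.println(cleanupDoneWhenClientReturns()); // false
  }
}
```

A client that polled for a signal set after the last cleanup step, rather than for descriptor removal, would not see this window. Which signal is suitable in HBase is left open here.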


