[jira] [Commented] (HBASE-10306) Backport HBASE-6820 to 0.94, MiniZookeeperCluster should ensure that ZKDatabase is closed upon shutdown()

2014-01-11 Thread chendihao (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13868719#comment-13868719
 ] 

chendihao commented on HBASE-10306:
---

Thanks for the review. [~lhofhansl] [~enis]

 Backport HBASE-6820 to 0.94, MiniZookeeperCluster should ensure that 
 ZKDatabase is closed upon shutdown()
 -

 Key: HBASE-10306
 URL: https://issues.apache.org/jira/browse/HBASE-10306
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.3
Reporter: chendihao
Assignee: chendihao
Priority: Minor
 Fix For: 0.94.16

 Attachments: HBASE-10306-0.94-v1.patch


 Backport HBASE-6820: [WINDOWS] MiniZookeeperCluster should ensure that 
 ZKDatabase is closed upon shutdown()



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-9203) Secondary index support through coprocessors

2014-01-11 Thread Liang Xie (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-9203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13868757#comment-13868757
 ] 

Liang Xie commented on HBASE-9203:
--

to me, one main drawback of this design is the required padding mechanism. For 
the index name section inside the index table's row key, maybe we can provide 
a doc to educate the end user, but the other indexed column value(s) depend 
entirely on the real user scenario. E.g. most indexed column values are 
probably very short, say "a", with a few long values, say "abcde...z"; then 
even for the short "a" value, under the current design we still need to pad it 
to something like "a000..0", am I correct?   I don't have a better improvement 
idea though...
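
A minimal sketch of the padding scheme being questioned (hypothetical names, 
not code from the attached design): every indexed value is right-padded with 
0x00 to a fixed width so all index-table row keys share one layout, which is 
exactly why a one-byte value still costs the full width:

{code}
import java.util.Arrays;

public class IndexKeyPadding {
  // assumed fixed width for the indexed column value (illustrative only)
  static final int VALUE_WIDTH = 16;

  static byte[] padValue(byte[] value) {
    if (value.length > VALUE_WIDTH) {
      throw new IllegalArgumentException("value exceeds fixed width");
    }
    return Arrays.copyOf(value, VALUE_WIDTH); // copyOf zero-fills the tail
  }

  public static void main(String[] args) {
    // "a" still occupies VALUE_WIDTH bytes in the index row key: 'a' + 15 x 0x00
    System.out.println(padValue("a".getBytes()).length); // prints 16
  }
}
{code}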

 Secondary index support through coprocessors
 

 Key: HBASE-9203
 URL: https://issues.apache.org/jira/browse/HBASE-9203
 Project: HBase
  Issue Type: New Feature
Reporter: rajeshbabu
Assignee: rajeshbabu
 Attachments: SecondaryIndex Design.pdf, SecondaryIndex 
 Design_Updated.pdf, SecondaryIndex Design_Updated_2.pdf


 We have been working on implementing secondary index in HBase and open 
 sourced it on the hbase 0.94.8 version.
 The project is available on github:
 https://github.com/Huawei-Hadoop/hindex
 This Jira is to support secondary index on trunk (0.98).
 The following features will be supported:
 -  multiple indexes on table,
 -  multi column index,
 -  index based on part of a column value,
 -  equals and range condition scans using index, and
 -  bulk loading data to indexed table (Indexing done with bulk load)
 Most of the kernel changes needed for secondary index are available in 
 trunk; very minimal additional changes are needed for it.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-9203) Secondary index support through coprocessors

2014-01-11 Thread Liang Xie (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-9203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13868759#comment-13868759
 ] 

Liang Xie commented on HBASE-9203:
--

another problem: scan performance (e.g. when there's a filter on the indexed 
column). How do we decide or evaluate when to query the user table directly, 
and when to query the index table first and then do the (multi-)get into the 
user table?
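
One way to frame that question: the index path pays an extra point Get per 
match, so there is a selectivity break-even point. A hedged, illustrative 
heuristic (all names and the threshold here are hypothetical, not part of the 
design):

{code}
public class IndexPathChooser {
  // Hypothetical heuristic: take the index path only when the filter is
  // selective enough that index-scan + per-match Gets beat a full scan.
  static boolean useIndexPath(long estimatedMatches, long userTableRows) {
    double selectivity = (double) estimatedMatches / userTableRows;
    return selectivity < 0.01; // illustrative threshold; needs measurement
  }

  public static void main(String[] args) {
    System.out.println(useIndexPath(5_000, 10_000_000));     // true: use index
    System.out.println(useIndexPath(4_000_000, 10_000_000)); // false: full scan
  }
}
{code}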

 Secondary index support through coprocessors
 

 Key: HBASE-9203
 URL: https://issues.apache.org/jira/browse/HBASE-9203
 Project: HBase
  Issue Type: New Feature
Reporter: rajeshbabu
Assignee: rajeshbabu
 Attachments: SecondaryIndex Design.pdf, SecondaryIndex 
 Design_Updated.pdf, SecondaryIndex Design_Updated_2.pdf


 We have been working on implementing secondary index in HBase and open 
 sourced it on the hbase 0.94.8 version.
 The project is available on github:
 https://github.com/Huawei-Hadoop/hindex
 This Jira is to support secondary index on trunk (0.98).
 The following features will be supported:
 -  multiple indexes on table,
 -  multi column index,
 -  index based on part of a column value,
 -  equals and range condition scans using index, and
 -  bulk loading data to indexed table (Indexing done with bulk load)
 Most of the kernel changes needed for secondary index are available in 
 trunk; very minimal additional changes are needed for it.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10296) Replace ZK with a paxos running within master processes to provide better master failover performance and state consistency

2014-01-11 Thread Feng Honghua (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13868787#comment-13868787
 ] 

Feng Honghua commented on HBASE-10296:
--

bq. but that ZK path is used to find the hbase master even if it moves round a 
cluster - what would happen there?
Typically we adopt master-based paxos in practice, so naturally the master 
process hosting the master paxos replica is the active master. The active 
master is elected by the paxos protocol, not by zk, and each standby master 
knows who the current active master is. When the active master moves around 
(for instance when the active master dies or its lease times out), a client or 
app that attempts to talk to the old active master will fail in one of two 
ways: it fails to connect if the active master died, or it fails by learning 
that the contacted master is no longer active, along with who the current 
active master is. In the former case the client/app retries a randomly chosen 
other alive master instance, and that master will accept the request if it is 
the new active master, or else tell it who the current active master is. In 
the latter case it can now talk to the active master directly... And, just as 
with accessing a zk quorum, the client/app should know the master ensemble 
addresses to access an HBase cluster. (assuming you're asking about finding 
the active master, correct me if I'm wrong)
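
A rough sketch of that client-side discovery loop under the proposed design 
(all types here are hypothetical, not an existing HBase API):

{code}
import java.util.List;
import java.util.function.Function;

interface MasterStub {              // hypothetical RPC stub to one master
  boolean isActive();               // throws if the process is unreachable
  String activeMasterHint();        // a standby's view of the current leader
}

class MasterLocator {
  static String findActive(List<String> ensemble,
                           Function<String, MasterStub> connect) {
    for (String addr : ensemble) {
      try {
        MasterStub stub = connect.apply(addr);
        if (stub.isActive()) return addr;        // talked to the leader
        String hint = stub.activeMasterHint();   // standby redirects us
        if (hint != null) return hint;
      } catch (RuntimeException dead) {
        // this master process is down: try the next address in the ensemble
      }
    }
    throw new IllegalStateException("no active master reachable");
  }
}
{code}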

 Replace ZK with a paxos running within master processes to provide better 
 master failover performance and state consistency
 ---

 Key: HBASE-10296
 URL: https://issues.apache.org/jira/browse/HBASE-10296
 Project: HBase
  Issue Type: Brainstorming
  Components: master, Region Assignment, regionserver
Reporter: Feng Honghua

 Currently the master relies on ZK to elect the active master, monitor 
 liveness and store almost all of its state, such as region states, table 
 info, replication info and so on. ZK also serves as a channel for 
 master-regionserver communication (such as in region assignment) and 
 client-regionserver communication (such as replication state/behavior 
 changes). But zk as a communication channel is fragile due to its one-time 
 watch and asynchronous notification mechanism, which together can lead to 
 missed events (hence missed messages); for example, the master must rely on 
 the state transition logic's idempotence to maintain the region assignment 
 state machine's correctness. Actually, almost all of the trickiest 
 inconsistency issues can trace their root cause back to the fragility of zk 
 as a communication channel.
 Replacing zk with paxos running within the master processes has the 
 following benefits:
 1. better master failover performance: all masters, whether active or 
 standby, have the same latest state in memory (except lagging ones, which 
 can eventually catch up later on). Whenever the active master dies, the 
 newly elected active master can immediately play its role without failover 
 work such as rebuilding its in-memory state by consulting the meta table 
 and zk.
 2. better state consistency: the master's in-memory state is the only truth 
 about the system, which can eliminate inconsistency from the very beginning. 
 And though the state is held by all masters, paxos guarantees the copies 
 are identical at any time.
 3. a more direct and simple communication pattern: clients change state by 
 sending requests to the master; the master and regionservers talk directly 
 to each other via requests and responses... none of them needs to go 
 through third-party storage like zk, which can introduce more uncertainty, 
 worse latency and more complexity.
 4. zk would only be used for liveness monitoring to determine whether a 
 regionserver is dead, and later on we can eliminate zk entirely once we 
 build heartbeats between the master and regionservers.
 I know this might look like a very crazy re-architecture, but it deserves 
 deep thinking and serious discussion, right?



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10296) Replace ZK with a paxos running within master processes to provide better master failover performance and state consistency

2014-01-11 Thread Feng Honghua (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13868792#comment-13868792
 ] 

Feng Honghua commented on HBASE-10296:
--

bq. One aspect of ZK that is worth remembering is that it lets other apps keep 
an eye on what is going on
Yes, this is a good question. ZK's watch/notification pattern can be viewed as 
a communication mechanism: each ZK node represents a piece of data; app A 
updates the ZK node when it updates the data, and app B, which has a watch on 
the node, receives a notification when the data is updated.
If we use paxos to replace ZK, the data represented by each ZK node is instead 
hosted within each master process's memory as a data structure, updated via 
the paxos-replicated state machine triggered by client/regionserver requests. 
The watch/notification center thus moves from ZK to the master, and we can 
still use the node-watch-list mechanism that ZK uses for the implementation.
The above 'keep an eye on what is going on' (or watch/notify) then changes in 
two ways:
1. master - zk - regionserver communication is replaced by direct 
master-regionserver communication
2. client - zk - regionserver communication is replaced by 
client-master-regionserver communication (the master plays the role ZK 
originally did)
A note: we can now provide more flexible options by exposing sync/async 
notification and one-time/permanent watches; ZK provides only one-time async 
watches.
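
A hedged sketch of such an in-master node-watch-list registry, supporting both 
one-time and permanent watches (purely illustrative; these names and classes 
are not existing HBase code):

{code}
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

class WatchRegistry {
  enum Mode { ONE_TIME, PERMANENT }

  private static final class Watch {
    final Mode mode;
    final Consumer<byte[]> listener;
    Watch(Mode mode, Consumer<byte[]> listener) {
      this.mode = mode;
      this.listener = listener;
    }
  }

  private final Map<String, List<Watch>> watches = new ConcurrentHashMap<>();

  void watch(String path, Mode mode, Consumer<byte[]> listener) {
    watches.computeIfAbsent(path, p -> new CopyOnWriteArrayList<>())
           .add(new Watch(mode, listener));
  }

  // called by the replicated state machine after a state entry is updated
  void fire(String path, byte[] newData) {
    List<Watch> registered = watches.get(path);
    if (registered == null) return;
    for (Watch w : registered) {
      w.listener.accept(newData);       // sync notify here; could be async
      if (w.mode == Mode.ONE_TIME) {
        registered.remove(w);           // ZK-style one-shot watch
      }
    }
  }
}
{code}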

 Replace ZK with a paxos running within master processes to provide better 
 master failover performance and state consistency
 ---

 Key: HBASE-10296
 URL: https://issues.apache.org/jira/browse/HBASE-10296
 Project: HBase
  Issue Type: Brainstorming
  Components: master, Region Assignment, regionserver
Reporter: Feng Honghua

 Currently the master relies on ZK to elect the active master, monitor 
 liveness and store almost all of its state, such as region states, table 
 info, replication info and so on. ZK also serves as a channel for 
 master-regionserver communication (such as in region assignment) and 
 client-regionserver communication (such as replication state/behavior 
 changes). But zk as a communication channel is fragile due to its one-time 
 watch and asynchronous notification mechanism, which together can lead to 
 missed events (hence missed messages); for example, the master must rely on 
 the state transition logic's idempotence to maintain the region assignment 
 state machine's correctness. Actually, almost all of the trickiest 
 inconsistency issues can trace their root cause back to the fragility of zk 
 as a communication channel.
 Replacing zk with paxos running within the master processes has the 
 following benefits:
 1. better master failover performance: all masters, whether active or 
 standby, have the same latest state in memory (except lagging ones, which 
 can eventually catch up later on). Whenever the active master dies, the 
 newly elected active master can immediately play its role without failover 
 work such as rebuilding its in-memory state by consulting the meta table 
 and zk.
 2. better state consistency: the master's in-memory state is the only truth 
 about the system, which can eliminate inconsistency from the very beginning. 
 And though the state is held by all masters, paxos guarantees the copies 
 are identical at any time.
 3. a more direct and simple communication pattern: clients change state by 
 sending requests to the master; the master and regionservers talk directly 
 to each other via requests and responses... none of them needs to go 
 through third-party storage like zk, which can introduce more uncertainty, 
 worse latency and more complexity.
 4. zk would only be used for liveness monitoring to determine whether a 
 regionserver is dead, and later on we can eliminate zk entirely once we 
 build heartbeats between the master and regionservers.
 I know this might look like a very crazy re-architecture, but it deserves 
 deep thinking and serious discussion, right?



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10296) Replace ZK with a paxos running within master processes to provide better master failover performance and state consistency

2014-01-11 Thread Feng Honghua (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13868800#comment-13868800
 ] 

Feng Honghua commented on HBASE-10296:
--

bq. The google chubby paper goes into some detail about why they implemented a 
Paxos Service and not a paxos library.
I believe google does have a paxos library, which is used in megastore and 
spanner, right? And this fact is mentioned in the google paper *Paxos Made 
Live* :-)
Implementing paxos as a standalone/shared service or as a library each has 
its own benefits and drawbacks.
A service: a simple API that is simple for apps to use, and it can be shared 
by multiple apps; but abuse by one app can negatively affect other apps using 
the same paxos service (we encountered such cases several times before :-()
A library: more difficult for an app to use, but it gives better isolation 
(an app won't be affected by possible abuse from other apps), and it offers 
more primitives and more flexibility.

 Replace ZK with a paxos running within master processes to provide better 
 master failover performance and state consistency
 ---

 Key: HBASE-10296
 URL: https://issues.apache.org/jira/browse/HBASE-10296
 Project: HBase
  Issue Type: Brainstorming
  Components: master, Region Assignment, regionserver
Reporter: Feng Honghua

 Currently the master relies on ZK to elect the active master, monitor 
 liveness and store almost all of its state, such as region states, table 
 info, replication info and so on. ZK also serves as a channel for 
 master-regionserver communication (such as in region assignment) and 
 client-regionserver communication (such as replication state/behavior 
 changes). But zk as a communication channel is fragile due to its one-time 
 watch and asynchronous notification mechanism, which together can lead to 
 missed events (hence missed messages); for example, the master must rely on 
 the state transition logic's idempotence to maintain the region assignment 
 state machine's correctness. Actually, almost all of the trickiest 
 inconsistency issues can trace their root cause back to the fragility of zk 
 as a communication channel.
 Replacing zk with paxos running within the master processes has the 
 following benefits:
 1. better master failover performance: all masters, whether active or 
 standby, have the same latest state in memory (except lagging ones, which 
 can eventually catch up later on). Whenever the active master dies, the 
 newly elected active master can immediately play its role without failover 
 work such as rebuilding its in-memory state by consulting the meta table 
 and zk.
 2. better state consistency: the master's in-memory state is the only truth 
 about the system, which can eliminate inconsistency from the very beginning. 
 And though the state is held by all masters, paxos guarantees the copies 
 are identical at any time.
 3. a more direct and simple communication pattern: clients change state by 
 sending requests to the master; the master and regionservers talk directly 
 to each other via requests and responses... none of them needs to go 
 through third-party storage like zk, which can introduce more uncertainty, 
 worse latency and more complexity.
 4. zk would only be used for liveness monitoring to determine whether a 
 regionserver is dead, and later on we can eliminate zk entirely once we 
 build heartbeats between the master and regionservers.
 I know this might look like a very crazy re-architecture, but it deserves 
 deep thinking and serious discussion, right?



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10296) Replace ZK with a paxos running within master processes to provide better master failover performance and state consistency

2014-01-11 Thread Feng Honghua (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13868802#comment-13868802
 ] 

Feng Honghua commented on HBASE-10296:
--

[~lhofhansl] / [~apurtell] / [~ste...@apache.org] / [~e...@apache.org] : 
1. paxos / raft / a zab library extracted from ZK are all good candidates :-)
2. I agree that implementing a *correct* consensus protocol for production 
use is extremely hard; that's why I tagged this jira's type as Brainstorming. 
My intention is to raise it so we can discuss what a better / more reasonable 
architecture would look like.
3. If we finally all agree on a better architecture/design after 
analysis/discussion/proof, we can approach it in a conservative and 
incremental way; maybe eventually, someday, we make it :-)

 Replace ZK with a paxos running within master processes to provide better 
 master failover performance and state consistency
 ---

 Key: HBASE-10296
 URL: https://issues.apache.org/jira/browse/HBASE-10296
 Project: HBase
  Issue Type: Brainstorming
  Components: master, Region Assignment, regionserver
Reporter: Feng Honghua

 Currently the master relies on ZK to elect the active master, monitor 
 liveness and store almost all of its state, such as region states, table 
 info, replication info and so on. ZK also serves as a channel for 
 master-regionserver communication (such as in region assignment) and 
 client-regionserver communication (such as replication state/behavior 
 changes). But zk as a communication channel is fragile due to its one-time 
 watch and asynchronous notification mechanism, which together can lead to 
 missed events (hence missed messages); for example, the master must rely on 
 the state transition logic's idempotence to maintain the region assignment 
 state machine's correctness. Actually, almost all of the trickiest 
 inconsistency issues can trace their root cause back to the fragility of zk 
 as a communication channel.
 Replacing zk with paxos running within the master processes has the 
 following benefits:
 1. better master failover performance: all masters, whether active or 
 standby, have the same latest state in memory (except lagging ones, which 
 can eventually catch up later on). Whenever the active master dies, the 
 newly elected active master can immediately play its role without failover 
 work such as rebuilding its in-memory state by consulting the meta table 
 and zk.
 2. better state consistency: the master's in-memory state is the only truth 
 about the system, which can eliminate inconsistency from the very beginning. 
 And though the state is held by all masters, paxos guarantees the copies 
 are identical at any time.
 3. a more direct and simple communication pattern: clients change state by 
 sending requests to the master; the master and regionservers talk directly 
 to each other via requests and responses... none of them needs to go 
 through third-party storage like zk, which can introduce more uncertainty, 
 worse latency and more complexity.
 4. zk would only be used for liveness monitoring to determine whether a 
 regionserver is dead, and later on we can eliminate zk entirely once we 
 build heartbeats between the master and regionservers.
 I know this might look like a very crazy re-architecture, but it deserves 
 deep thinking and serious discussion, right?



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10296) Replace ZK with a paxos running within master processes to provide better master failover performance and state consistency

2014-01-11 Thread Feng Honghua (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13868803#comment-13868803
 ] 

Feng Honghua commented on HBASE-10296:
--

Thanks, everyone, for the questions/direction/material/history notes, really 
appreciated :-)

 Replace ZK with a paxos running within master processes to provide better 
 master failover performance and state consistency
 ---

 Key: HBASE-10296
 URL: https://issues.apache.org/jira/browse/HBASE-10296
 Project: HBase
  Issue Type: Brainstorming
  Components: master, Region Assignment, regionserver
Reporter: Feng Honghua

 Currently the master relies on ZK to elect the active master, monitor 
 liveness and store almost all of its state, such as region states, table 
 info, replication info and so on. ZK also serves as a channel for 
 master-regionserver communication (such as in region assignment) and 
 client-regionserver communication (such as replication state/behavior 
 changes). But zk as a communication channel is fragile due to its one-time 
 watch and asynchronous notification mechanism, which together can lead to 
 missed events (hence missed messages); for example, the master must rely on 
 the state transition logic's idempotence to maintain the region assignment 
 state machine's correctness. Actually, almost all of the trickiest 
 inconsistency issues can trace their root cause back to the fragility of zk 
 as a communication channel.
 Replacing zk with paxos running within the master processes has the 
 following benefits:
 1. better master failover performance: all masters, whether active or 
 standby, have the same latest state in memory (except lagging ones, which 
 can eventually catch up later on). Whenever the active master dies, the 
 newly elected active master can immediately play its role without failover 
 work such as rebuilding its in-memory state by consulting the meta table 
 and zk.
 2. better state consistency: the master's in-memory state is the only truth 
 about the system, which can eliminate inconsistency from the very beginning. 
 And though the state is held by all masters, paxos guarantees the copies 
 are identical at any time.
 3. a more direct and simple communication pattern: clients change state by 
 sending requests to the master; the master and regionservers talk directly 
 to each other via requests and responses... none of them needs to go 
 through third-party storage like zk, which can introduce more uncertainty, 
 worse latency and more complexity.
 4. zk would only be used for liveness monitoring to determine whether a 
 regionserver is dead, and later on we can eliminate zk entirely once we 
 build heartbeats between the master and regionservers.
 I know this might look like a very crazy re-architecture, but it deserves 
 deep thinking and serious discussion, right?



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10295) Refactor the replication implementation to eliminate permanent zk node

2014-01-11 Thread Feng Honghua (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13868811#comment-13868811
 ] 

Feng Honghua commented on HBASE-10295:
--

[~stack] Yes, sounds feasible, thanks :-)

 Refactor the replication implementation to eliminate permanent zk node
 ---

 Key: HBASE-10295
 URL: https://issues.apache.org/jira/browse/HBASE-10295
 Project: HBase
  Issue Type: Bug
  Components: Replication
Reporter: Feng Honghua
Assignee: Feng Honghua
 Fix For: 0.99.0


 Though this is a broader and bigger change, its original motivation derives 
 from [HBASE-8751|https://issues.apache.org/jira/browse/HBASE-8751]: the newly 
 introduced per-peer tableCFs attribute should be treated the same way as the 
 peer-state, which is a permanent sub-node under the peer node; but using 
 permanent zk nodes is deemed an incorrect practice, so let's refactor to 
 eliminate the permanent zk node. HBASE-8751 can then align its newly 
 introduced per-peer tableCFs attribute with this *correct* implementation 
 theme.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-9203) Secondary index support through coprocessors

2014-01-11 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-9203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13868812#comment-13868812
 ] 

Anoop Sam John commented on HBASE-9203:
---

I think instead of the padding approach, we can change this to having a 
separator byte (0 byte)... That should work out.
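
A minimal sketch of that separator-byte alternative (a hypothetical helper, 
not code from the design; note it is only safe if indexed values can never 
themselves contain the 0 byte):

{code}
public class IndexKeySeparator {
  // index row key = value + 0x00 separator + user row key; keys still sort
  // by value prefix, with no fixed-width padding needed
  static byte[] indexRowKey(byte[] value, byte[] userRow) {
    byte[] key = new byte[value.length + 1 + userRow.length];
    System.arraycopy(value, 0, key, 0, value.length);
    key[value.length] = 0x00; // separator terminates the variable-length value
    System.arraycopy(userRow, 0, key, value.length + 1, userRow.length);
    return key;
  }

  public static void main(String[] args) {
    byte[] key = indexRowKey("a".getBytes(), "row0001".getBytes());
    System.out.println(key.length); // 1 + 1 + 7 = 9 bytes, no padding
  }
}
{code}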

 Secondary index support through coprocessors
 

 Key: HBASE-9203
 URL: https://issues.apache.org/jira/browse/HBASE-9203
 Project: HBase
  Issue Type: New Feature
Reporter: rajeshbabu
Assignee: rajeshbabu
 Attachments: SecondaryIndex Design.pdf, SecondaryIndex 
 Design_Updated.pdf, SecondaryIndex Design_Updated_2.pdf


 We have been working on implementing secondary index in HBase and open 
 sourced it on the hbase 0.94.8 version.
 The project is available on github:
 https://github.com/Huawei-Hadoop/hindex
 This Jira is to support secondary index on trunk (0.98).
 The following features will be supported:
 -  multiple indexes on table,
 -  multi column index,
 -  index based on part of a column value,
 -  equals and range condition scans using index, and
 -  bulk loading data to indexed table (Indexing done with bulk load)
 Most of the kernel changes needed for secondary index are available in 
 trunk; very minimal additional changes are needed for it.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10310) ZNodeCleaner session expired for /hbase/master

2014-01-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13868819#comment-13868819
 ] 

Hudson commented on HBASE-10310:


SUCCESS: Integrated in HBase-TRUNK-on-Hadoop-1.1 #50 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-1.1/50/])
HBASE-10310. ZNodeCleaner session expired for /hbase/master (Samir Ahmic) 
(apurtell: rev 1557273)
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/ZNodeClearer.java


 ZNodeCleaner session expired for /hbase/master
 --

 Key: HBASE-10310
 URL: https://issues.apache.org/jira/browse/HBASE-10310
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.96.1.1
 Environment: x86_64 GNU/Linux
Reporter: Samir Ahmic
Assignee: Samir Ahmic
 Fix For: 0.98.0, 0.96.2, 0.99.0

 Attachments: HBASE-10310.patch


 I was testing the hbase master clear command while working on [HBASE-7386]; 
 here is the command and exception:
 {code}
 $ export HBASE_ZNODE_FILE=/tmp/hbase-hadoop-master.znode; ./hbase master clear
 14/01/10 14:05:44 INFO zookeeper.ZooKeeper: Initiating client connection, 
 connectString=zk1:2181 sessionTimeout=9 watcher=clean znode for master, 
 quorum=zk1:2181, baseZNode=/hbase
 14/01/10 14:05:44 INFO zookeeper.RecoverableZooKeeper: Process 
 identifier=clean znode for master connecting to ZooKeeper ensemble=zk1:2181
 14/01/10 14:05:44 INFO zookeeper.ClientCnxn: Opening socket connection to 
 server zk1/172.17.33.5:2181. Will not attempt to authenticate using SASL 
 (Unable to locate a login configuration)
 14/01/10 14:05:44 INFO zookeeper.ClientCnxn: Socket connection established to 
 zk11/172.17.33.5:2181, initiating session
 14/01/10 14:05:44 INFO zookeeper.ClientCnxn: Session establishment complete 
 on server zk1/172.17.33.5:2181, sessionid = 0x1427a96bfea4a8a, negotiated 
 timeout = 4
 14/01/10 14:05:44 INFO zookeeper.ZooKeeper: Session: 0x1427a96bfea4a8a closed
 14/01/10 14:05:44 INFO zookeeper.ClientCnxn: EventThread shut down
 14/01/10 14:05:44 WARN zookeeper.RecoverableZooKeeper: Possibly transient 
 ZooKeeper, quorum=zk1:2181, 
 exception=org.apache.zookeeper.KeeperException$SessionExpiredException: 
 KeeperErrorCode = Session expired for /hbase/master
 14/01/10 14:05:44 INFO util.RetryCounter: Sleeping 1000ms before retry #0...
 14/01/10 14:05:45 WARN zookeeper.RecoverableZooKeeper: Possibly transient 
 ZooKeeper, quorum=zk1:2181, 
 exception=org.apache.zookeeper.KeeperException$SessionExpiredException: 
 KeeperErrorCode = Session expired for /hbase/master
 14/01/10 14:05:45 ERROR zookeeper.RecoverableZooKeeper: ZooKeeper getData 
 failed after 1 attempts
 14/01/10 14:05:45 WARN zookeeper.ZKUtil: clean znode for 
 master-0x1427a96bfea4a8a, quorum=zk1:2181, baseZNode=/hbase Unable to get 
 data of znode /hbase/master
 org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode 
 = Session expired for /hbase/master
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
   at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1151)
   at 
 org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:337)
   at 
 org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataNoWatch(ZKUtil.java:777)
   at 
 org.apache.hadoop.hbase.zookeeper.MasterAddressTracker.deleteIfEquals(MasterAddressTracker.java:170)
   at org.apache.hadoop.hbase.ZNodeClearer.clear(ZNodeClearer.java:160)
   at 
 org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:138)
   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
   at 
 org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126)
   at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2779)
 14/01/10 14:05:45 ERROR zookeeper.ZooKeeperWatcher: clean znode for 
 master-0x1427a96bfea4a8a, quorum=zk1:2181, baseZNode=/hbase Received 
 unexpected KeeperException, re-throwing exception
 org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode 
 = Session expired for /hbase/master
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
   at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
   at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1151)
   at 
 org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:337)
   at 
 org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataNoWatch(ZKUtil.java:777)
   at 
 org.apache.hadoop.hbase.zookeeper.MasterAddressTracker.deleteIfEquals(MasterAddressTracker.java:170)
   at 

[jira] [Commented] (HBASE-10318) generate-hadoopX-poms.sh expects the version to have one extra '-'

2014-01-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13868821#comment-13868821
 ] 

Hudson commented on HBASE-10318:


SUCCESS: Integrated in HBase-TRUNK-on-Hadoop-1.1 #50 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-1.1/50/])
HBASE-10318. generate-hadoopX-poms.sh expects the version to have one extra '-' 
(Raja Aluri) (apurtell: rev 1557301)
* /hbase/trunk/dev-support/generate-hadoopX-poms.sh


 generate-hadoopX-poms.sh expects the version to have one extra '-'
 --

 Key: HBASE-10318
 URL: https://issues.apache.org/jira/browse/HBASE-10318
 Project: HBase
  Issue Type: Bug
  Components: build
Affects Versions: 0.98.0
Reporter: Raja Aluri
Assignee: Raja Aluri
 Fix For: 0.98.0, 0.99.0

 Attachments: 
 0001-HBASE-10318-generate-hadoopX-poms.sh-expects-the-ver.patch


 This change is in 0.96 branch, but missing in 0.98.
 Including the commit that made this 
 [change|https://github.com/apache/hbase/commit/09442ca]



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10265) Upgrade to commons-logging 1.1.3

2014-01-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13868820#comment-13868820
 ] 

Hudson commented on HBASE-10265:


SUCCESS: Integrated in HBase-TRUNK-on-Hadoop-1.1 #50 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-1.1/50/])
HBASE-10265 Upgrade to commons-logging 1.1.3 (liangxie: rev 1557299)
* /hbase/trunk/pom.xml


 Upgrade to commons-logging 1.1.3
 

 Key: HBASE-10265
 URL: https://issues.apache.org/jira/browse/HBASE-10265
 Project: HBase
  Issue Type: Improvement
  Components: build
Affects Versions: 0.99.0
Reporter: Liang Xie
Assignee: Liang Xie
 Fix For: 0.99.0

 Attachments: HBASE-10265.txt


 Per HADOOP-10147 and HDFS-5678, though we didn't observe any deadlock due to 
 commons-logging in HBase, to me it's still worth bumping the version.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-9203) Secondary index support through coprocessors

2014-01-11 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-9203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13868831#comment-13868831
 ] 

ramkrishna.s.vasudevan commented on HBASE-9203:
---

bq. about scan (e.g. there's a filter on the indexed column) performance. how 
to decide or evaluate when we do the query into user table directly and when 
we do the query into index table first then do the (multi-)get into user table?
You mean the dynamic decision whether to use the index or not?  Should that 
be a user decision?

 Secondary index support through coprocessors
 

 Key: HBASE-9203
 URL: https://issues.apache.org/jira/browse/HBASE-9203
 Project: HBase
  Issue Type: New Feature
Reporter: rajeshbabu
Assignee: rajeshbabu
 Attachments: SecondaryIndex Design.pdf, SecondaryIndex 
 Design_Updated.pdf, SecondaryIndex Design_Updated_2.pdf


 We have been working on implementing secondary index in HBase and open 
 sourced it on the hbase 0.94.8 version.
 The project is available on github:
 https://github.com/Huawei-Hadoop/hindex
 This Jira is to support secondary index on trunk (0.98).
 The following features will be supported:
 -  multiple indexes on table,
 -  multi column index,
 -  index based on part of a column value,
 -  equals and range condition scans using index, and
 -  bulk loading data to indexed table (Indexing done with bulk load)
 Most of the kernel changes needed for secondary index are available in 
 trunk; very minimal additional changes are needed for it.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10292) TestRegionServerCoprocessorExceptionWithAbort fails occasionally

2014-01-11 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13868852#comment-13868852
 ] 

Andrew Purtell commented on HBASE-10292:


I am going to commit the addendum momentarily: after 30 executions of the 
complete unit test suite on the test box where this test was failing, there 
are now zero test failures.

 TestRegionServerCoprocessorExceptionWithAbort fails occasionally
 

 Key: HBASE-10292
 URL: https://issues.apache.org/jira/browse/HBASE-10292
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.0
Reporter: Andrew Purtell
Assignee: Andrew Purtell
 Fix For: 0.98.0, 0.99.0

 Attachments: 10292-addendum-1.patch, 10292.patch, 10292.patch


 TestRegionServerCoprocessorExceptionWithAbort has occasionally failed for a 
 very long time now. Fix or disable.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10277) refactor AsyncProcess

2014-01-11 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13868855#comment-13868855
 ] 

Andrew Purtell commented on HBASE-10277:


bq. The (ugly) behavior for HTable where e.g. next put will give you errors 
from previous put was lovingly preserved.

To my mind this is a serious problem because the application is getting 
incorrect feedback on what operation actually failed.

 refactor AsyncProcess
 -

 Key: HBASE-10277
 URL: https://issues.apache.org/jira/browse/HBASE-10277
 Project: HBase
  Issue Type: Improvement
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HBASE-10277.patch


 AsyncProcess currently has two patterns of usage, one from HTable flush w/o 
 callback and with reuse, and one from HCM/HTable batch call, with callback 
 and w/o reuse. In the former case (but not the latter), it also does some 
 throttling of actions on initial submit call, limiting the number of 
 outstanding actions per server.
 The latter case is relatively straightforward. The former appears to be error 
 prone due to reuse - if, as javadoc claims should be safe, multiple submit 
 calls are performed without waiting for the async part of the previous call 
 to finish, fields like hasError become ambiguous and can be used for the 
 wrong call; callback for success/failure is called based on original index 
 of an action in submitted list, but with only one callback supplied to AP in 
 ctor it's not clear to which submit call the index belongs, if several are 
 outstanding.
 I was going to add support for HBASE-10070 to AP, and found that it might be 
 difficult to do cleanly.
 It would be nice to normalize AP usage patterns; in particular, separate the 
 global part (load tracking) from per-submit-call part.
 Per-submit part can more conveniently track stuff like initialActions, 
 mapping of indexes and retry information, that is currently passed around the 
 method calls.
 I am not sure yet, but maybe sending of the original index to server in 
 ClientProtos.MultiAction can also be avoided.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Resolved] (HBASE-10292) TestRegionServerCoprocessorExceptionWithAbort fails occasionally

2014-01-11 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell resolved HBASE-10292.


Resolution: Fixed

 TestRegionServerCoprocessorExceptionWithAbort fails occasionally
 

 Key: HBASE-10292
 URL: https://issues.apache.org/jira/browse/HBASE-10292
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.0
Reporter: Andrew Purtell
Assignee: Andrew Purtell
 Fix For: 0.98.0, 0.99.0

 Attachments: 10292-addendum-1.patch, 10292.patch, 10292.patch


 TestRegionServerCoprocessorExceptionWithAbort has occasionally failed for a 
 very long time now. Fix or disable.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10292) TestRegionServerCoprocessorExceptionWithAbort fails occasionally

2014-01-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13868885#comment-13868885
 ] 

Hudson commented on HBASE-10292:


FAILURE: Integrated in HBase-0.98 #72 (See 
[https://builds.apache.org/job/HBase-0.98/72/])
Amend HBASE-10292. TestRegionServerCoprocessorExceptionWithAbort fails 
occasionally (apurtell: rev 1557438)
* 
/hbase/branches/0.98/hbase-server/src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionServerCoprocessorExceptionWithAbort.java
* 
/hbase/branches/0.98/hbase-server/src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionServerCoprocessorExceptionWithRemove.java


 TestRegionServerCoprocessorExceptionWithAbort fails occasionally
 

 Key: HBASE-10292
 URL: https://issues.apache.org/jira/browse/HBASE-10292
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.0
Reporter: Andrew Purtell
Assignee: Andrew Purtell
 Fix For: 0.98.0, 0.99.0

 Attachments: 10292-addendum-1.patch, 10292.patch, 10292.patch


 TestRegionServerCoprocessorExceptionWithAbort has occasionally failed for a 
 very long time now. Fix or disable.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10296) Replace ZK with a paxos running within master processes to provide better master failover performance and state consistency

2014-01-11 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13868890#comment-13868890
 ] 

Lars Hofhansl commented on HBASE-10296:
---

I always thought that having processes participate in the

 Replace ZK with a paxos running within master processes to provide better 
 master failover performance and state consistency
 ---

 Key: HBASE-10296
 URL: https://issues.apache.org/jira/browse/HBASE-10296
 Project: HBase
  Issue Type: Brainstorming
  Components: master, Region Assignment, regionserver
Reporter: Feng Honghua

 Currently the master relies on ZK to elect the active master, monitor 
 liveness and store almost all of its state, such as region states, table 
 info, replication info and so on. ZK also serves as a channel for 
 master-regionserver communication (such as in region assignment) and 
 client-regionserver communication (such as replication state/behavior 
 changes). But zk as a communication channel is fragile due to its one-time 
 watch and asynchronous notification mechanism, which together can lead to 
 missed events (hence missed messages); for example, the master must rely on 
 the state transition logic's idempotence to maintain the region assignment 
 state machine's correctness. Actually, almost all of the trickiest 
 inconsistency issues can trace their root cause back to the fragility of zk 
 as a communication channel.
 Replacing zk with paxos running within the master processes has the 
 following benefits:
 1. better master failover performance: all masters, whether active or 
 standby, have the same latest state in memory (except lagging ones, which 
 can eventually catch up later on). Whenever the active master dies, the 
 newly elected active master can immediately play its role without failover 
 work such as rebuilding its in-memory state by consulting the meta table 
 and zk.
 2. better state consistency: the master's in-memory state is the only truth 
 about the system, which can eliminate inconsistency from the very beginning. 
 And though the state is held by all masters, paxos guarantees the copies 
 are identical at any time.
 3. a more direct and simple communication pattern: clients change state by 
 sending requests to the master; the master and regionservers talk directly 
 to each other via requests and responses... none of them needs to go 
 through third-party storage like zk, which can introduce more uncertainty, 
 worse latency and more complexity.
 4. zk would only be used for liveness monitoring to determine whether a 
 regionserver is dead, and later on we can eliminate zk entirely once we 
 build heartbeats between the master and regionservers.
 I know this might look like a very crazy re-architecture, but it deserves 
 deep thinking and serious discussion, right?



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Comment Edited] (HBASE-10296) Replace ZK with a paxos running within master processes to provide better master failover performance and state consistency

2014-01-11 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13868890#comment-13868890
 ] 

Lars Hofhansl edited comment on HBASE-10296 at 1/11/14 9:59 PM:


I always thought that having processes participate in the coordination 
directly (as group members) rather than using an external group membership 
service would be better, so I was very disappointed, when I first looked at 
ZK, that ZAB was buried too deeply within the rest of ZK.

ZK on the other hand is simple (because somebody else solved the hard problems 
for us). So I can see this go both ways.

On some level that ties into the discussion as to why we have master and 
regionserver roles. Couldn't all servers serve both roles as needed?



was (Author: lhofhansl):
I always thought that having processes participate in the

 Replace ZK with a paxos running within master processes to provide better 
 master failover performance and state consistency
 ---

 Key: HBASE-10296
 URL: https://issues.apache.org/jira/browse/HBASE-10296
 Project: HBase
  Issue Type: Brainstorming
  Components: master, Region Assignment, regionserver
Reporter: Feng Honghua

 Currently the master relies on ZK to elect the active master, monitor 
 liveness and store almost all of its state, such as region states, table 
 info, replication info and so on. ZK also serves as a channel for 
 master-regionserver communication (such as in region assignment) and 
 client-regionserver communication (such as replication state/behavior 
 changes). But zk as a communication channel is fragile due to its one-time 
 watch and asynchronous notification mechanism, which together can lead to 
 missed events (hence missed messages); for example, the master must rely on 
 the state transition logic's idempotence to maintain the region assignment 
 state machine's correctness. Actually, almost all of the trickiest 
 inconsistency issues can trace their root cause back to the fragility of zk 
 as a communication channel.
 Replacing zk with paxos running within the master processes has the 
 following benefits:
 1. better master failover performance: all masters, whether active or 
 standby, have the same latest state in memory (except lagging ones, which 
 can eventually catch up later on). Whenever the active master dies, the 
 newly elected active master can immediately play its role without failover 
 work such as rebuilding its in-memory state by consulting the meta table 
 and zk.
 2. better state consistency: the master's in-memory state is the only truth 
 about the system, which can eliminate inconsistency from the very beginning. 
 And though the state is held by all masters, paxos guarantees the copies 
 are identical at any time.
 3. a more direct and simple communication pattern: clients change state by 
 sending requests to the master; the master and regionservers talk directly 
 to each other via requests and responses... none of them needs to go 
 through third-party storage like zk, which can introduce more uncertainty, 
 worse latency and more complexity.
 4. zk would only be used for liveness monitoring to determine whether a 
 regionserver is dead, and later on we can eliminate zk entirely once we 
 build heartbeats between the master and regionservers.
 I know this might look like a very crazy re-architecture, but it deserves 
 deep thinking and serious discussion, right?



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10292) TestRegionServerCoprocessorExceptionWithAbort fails occasionally

2014-01-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13868893#comment-13868893
 ] 

Hudson commented on HBASE-10292:


FAILURE: Integrated in HBase-TRUNK #4808 (See 
[https://builds.apache.org/job/HBase-TRUNK/4808/])
Amend HBASE-10292. TestRegionServerCoprocessorExceptionWithAbort fails 
occasionally (apurtell: rev 1557436)
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionServerCoprocessorExceptionWithAbort.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionServerCoprocessorExceptionWithRemove.java


 TestRegionServerCoprocessorExceptionWithAbort fails occasionally
 

 Key: HBASE-10292
 URL: https://issues.apache.org/jira/browse/HBASE-10292
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.0
Reporter: Andrew Purtell
Assignee: Andrew Purtell
 Fix For: 0.98.0, 0.99.0

 Attachments: 10292-addendum-1.patch, 10292.patch, 10292.patch


 TestRegionServerCoprocessorExceptionWithAbort has occasionally failed for a 
 very long time now. Fix or disable.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10292) TestRegionServerCoprocessorExceptionWithAbort fails occasionally

2014-01-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13868899#comment-13868899
 ] 

Hudson commented on HBASE-10292:


SUCCESS: Integrated in HBase-0.98-on-Hadoop-1.1 #67 (See 
[https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/67/])
Amend HBASE-10292. TestRegionServerCoprocessorExceptionWithAbort fails 
occasionally (apurtell: rev 1557438)
* 
/hbase/branches/0.98/hbase-server/src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionServerCoprocessorExceptionWithAbort.java
* 
/hbase/branches/0.98/hbase-server/src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionServerCoprocessorExceptionWithRemove.java


 TestRegionServerCoprocessorExceptionWithAbort fails occasionally
 

 Key: HBASE-10292
 URL: https://issues.apache.org/jira/browse/HBASE-10292
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.0
Reporter: Andrew Purtell
Assignee: Andrew Purtell
 Fix For: 0.98.0, 0.99.0

 Attachments: 10292-addendum-1.patch, 10292.patch, 10292.patch


 TestRegionServerCoprocessorExceptionWithAbort has occasionally failed for a 
 very long time now. Fix or disable.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Created] (HBASE-10320) Avoid ArrayList.iterator() in tight loops

2014-01-11 Thread Lars Hofhansl (JIRA)
Lars Hofhansl created HBASE-10320:
-

 Summary: Avoid ArrayList.iterator() in tight loops
 Key: HBASE-10320
 URL: https://issues.apache.org/jira/browse/HBASE-10320
 Project: HBase
  Issue Type: Bug
  Components: Performance
Reporter: Lars Hofhansl


I noticed in a profiler (sampler) run that ScanQueryMatcher.setRow(...) showed 
up at all.
It turns out that the expensive part is iterating over the columns in 
ExplicitColumnTracker.reset(). I did some microbenchmarks and found that
{code}
private ArrayList<X> l;
...
for (int i = 0; i < l.size(); i++) {
   X x = l.get(i);
   ...
}
{code}
is twice as fast as:
{code}
private ArrayList<X> l;
...
for (X x : l) {
   ...
}
{code}

The indexed version asymptotically approaches the iterator version as the list 
grows, but even at 1m entries it is still faster.
In my tight-loop scans this provides a 5% overall performance improvement when 
the ExplicitColumnTracker is used.
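
For reproduction, a minimal self-contained harness along these lines (not the 
attached patch; a JMH benchmark would be more rigorous, and results vary by 
JVM):

{code}
import java.util.ArrayList;

public class LoopBench {
  public static void main(String[] args) {
    ArrayList<Integer> l = new ArrayList<>();
    for (int i = 0; i < 10; i++) l.add(i); // small list, as in setRow()'s case
    long sum = 0;
    long t0 = System.nanoTime();
    for (int iter = 0; iter < 10_000_000; iter++) {
      for (int i = 0; i < l.size(); i++) sum += l.get(i); // indexed access
    }
    long t1 = System.nanoTime();
    for (int iter = 0; iter < 10_000_000; iter++) {
      for (int x : l) sum += x; // for-each allocates an Iterator per pass
    }
    long t2 = System.nanoTime();
    System.out.printf("indexed %d ms, iterator %d ms (sum=%d)%n",
        (t1 - t0) / 1_000_000, (t2 - t1) / 1_000_000, sum);
  }
}
{code}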




--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HBASE-10320) Avoid ArrayList.iterator() in tight loops

2014-01-11 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-10320:
--

Attachment: 10320-0.94.txt

Simple patch for 0.94.
The
{code}
-  private final List<ColumnCount> columns;
+  private final ArrayList<ColumnCount> columns;
{code}

is not needed, but it makes it explicit that this list is accessed by random 
access.

 Avoid ArrayList.iterator() in tight loops
 -

 Key: HBASE-10320
 URL: https://issues.apache.org/jira/browse/HBASE-10320
 Project: HBase
  Issue Type: Bug
  Components: Performance
Reporter: Lars Hofhansl
 Attachments: 10320-0.94.txt


 I noticed in a profiler (sampler) run that ScanQueryMatcher.setRow(...) 
 showed up at all.
 It turns out that the expensive part is iterating over the columns in 
 ExplicitColumnTracker.reset(). I did some microbenchmarks and found that
 {code}
 private ArrayList<X> l;
 ...
 for (int i = 0; i < l.size(); i++) {
    X x = l.get(i);
    ...
 }
 {code}
 is twice as fast as:
 {code}
 private ArrayList<X> l;
 ...
 for (X x : l) {
    ...
 }
 {code}
 The indexed version asymptotically approaches the iterator version as the 
 list grows, but even at 1m entries it is still faster.
 In my tight-loop scans this provides a 5% overall performance improvement 
 when the ExplicitColumnTracker is used.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10292) TestRegionServerCoprocessorExceptionWithAbort fails occasionally

2014-01-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13868918#comment-13868918
 ] 

Hudson commented on HBASE-10292:


SUCCESS: Integrated in HBase-TRUNK-on-Hadoop-1.1 #51 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-1.1/51/])
Amend HBASE-10292. TestRegionServerCoprocessorExceptionWithAbort fails 
occasionally (apurtell: rev 1557436)
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionServerCoprocessorExceptionWithAbort.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionServerCoprocessorExceptionWithRemove.java


 TestRegionServerCoprocessorExceptionWithAbort fails occasionally
 

 Key: HBASE-10292
 URL: https://issues.apache.org/jira/browse/HBASE-10292
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.0
Reporter: Andrew Purtell
Assignee: Andrew Purtell
 Fix For: 0.98.0, 0.99.0

 Attachments: 10292-addendum-1.patch, 10292.patch, 10292.patch


 TestRegionServerCoprocessorExceptionWithAbort has occasionally failed for a 
 very long time now. Fix or disable.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-6581) Build with hadoop.profile=3.0

2014-01-11 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13868941#comment-13868941
 ] 

Ted Yu commented on HBASE-6581:
---

Looks like patch v6 was generated incorrectly: hadoop-three-compat.xml should 
be a new file.
{code}
--- a/hbase-assembly/src/main/assembly/hadoop-three-compat.xml
+++ /dev/null
@@ -1,46 +0,0 @@
-<?xml version="1.0"?>
{code}
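
For contrast, a correctly generated patch would show the file as an addition, 
with the headers the other way round, roughly:

{code}
--- /dev/null
+++ b/hbase-assembly/src/main/assembly/hadoop-three-compat.xml
@@ -0,0 +1,46 @@
+<?xml version="1.0"?>
{code}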

 Build with hadoop.profile=3.0
 -

 Key: HBASE-6581
 URL: https://issues.apache.org/jira/browse/HBASE-6581
 Project: HBase
  Issue Type: Bug
Reporter: Eric Charles
Assignee: Eric Charles
 Fix For: 0.98.1

 Attachments: HBASE-6581-1.patch, HBASE-6581-2.patch, 
 HBASE-6581-20130821.patch, HBASE-6581-3.patch, HBASE-6581-4.patch, 
 HBASE-6581-5.patch, HBASE-6581-6.patch, HBASE-6581.diff, HBASE-6581.diff


 Building trunk with hadoop.profile=3.0 gives exceptions (see [1]) due to 
 changes in the hadoop maven module naming (and also the usage of 
 3.0-SNAPSHOT instead of 3.0.0-SNAPSHOT in hbase-common).
 I can provide a patch that would move most of hadoop dependencies in their 
 respective profiles and will define the correct hadoop deps in the 3.0 
 profile.
 Please tell me if that's ok to go this way.
 Thx, Eric
 [1]
 $ mvn clean install -Dhadoop.profile=3.0
 [INFO] Scanning for projects...
 [ERROR] The build could not read 3 projects - [Help 1]
 [ERROR]   
 [ERROR]   The project org.apache.hbase:hbase-server:0.95-SNAPSHOT 
 (/d/hbase.svn/hbase-server/pom.xml) has 3 errors
 [ERROR] 'dependencies.dependency.version' for 
 org.apache.hadoop:hadoop-common:jar is missing. @ line 655, column 21
 [ERROR] 'dependencies.dependency.version' for 
 org.apache.hadoop:hadoop-annotations:jar is missing. @ line 659, column 21
 [ERROR] 'dependencies.dependency.version' for 
 org.apache.hadoop:hadoop-minicluster:jar is missing. @ line 663, column 21
 [ERROR]   
 [ERROR]   The project org.apache.hbase:hbase-common:0.95-SNAPSHOT 
 (/d/hbase.svn/hbase-common/pom.xml) has 3 errors
 [ERROR] 'dependencies.dependency.version' for 
 org.apache.hadoop:hadoop-common:jar is missing. @ line 170, column 21
 [ERROR] 'dependencies.dependency.version' for 
 org.apache.hadoop:hadoop-annotations:jar is missing. @ line 174, column 21
 [ERROR] 'dependencies.dependency.version' for 
 org.apache.hadoop:hadoop-minicluster:jar is missing. @ line 178, column 21
 [ERROR]   
 [ERROR]   The project org.apache.hbase:hbase-it:0.95-SNAPSHOT 
 (/d/hbase.svn/hbase-it/pom.xml) has 3 errors
 [ERROR] 'dependencies.dependency.version' for 
 org.apache.hadoop:hadoop-common:jar is missing. @ line 220, column 18
 [ERROR] 'dependencies.dependency.version' for 
 org.apache.hadoop:hadoop-annotations:jar is missing. @ line 224, column 21
 [ERROR] 'dependencies.dependency.version' for 
 org.apache.hadoop:hadoop-minicluster:jar is missing. @ line 228, column 21
 [ERROR] 



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10296) Replace ZK with a paxos running within master processes to provide better master failover performance and state consistency

2014-01-11 Thread Eric Charles (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13868947#comment-13868947
 ] 

Eric Charles commented on HBASE-10296:
--

I think the issue is that zk is just a half solution. It is a coordination 
util, but the job is still to be done. For now the coordination logic is mainly 
done in the hbase code (a bit everywhere I think; there is no 
'coordination' package, no separation of concerns).
To evolve, there are 2 directions:
1. Embed the coordination in a protocol where it is built-in 
(zab, paxos or whatever).
2. Move the coordination out of the hbase code to an external layer. Zk is not 
enough; would Helix (which relies on Zk) be a good fit?


 Replace ZK with a paxos running within master processes to provide better 
 master failover performance and state consistency
 ---

 Key: HBASE-10296
 URL: https://issues.apache.org/jira/browse/HBASE-10296
 Project: HBase
  Issue Type: Brainstorming
  Components: master, Region Assignment, regionserver
Reporter: Feng Honghua

 Currently the master relies on ZK to elect the active master, monitor liveness and 
 store almost all of its states, such as region states, table info, 
 replication info and so on. And zk also serves as a channel for 
 master-regionserver communication (such as in region assignment) and 
 client-regionserver communication (such as replication state/behavior changes). 
 But zk as a communication channel is fragile due to its one-time watch and 
 asynchronous notification mechanism, which together can lead to missed 
 events (hence missed messages); for example the master must rely on the state 
 transition logic's idempotence to maintain the region assignment state 
 machine's correctness. Actually almost all of the trickiest inconsistency 
 issues can trace their root cause back to the fragility of zk as a 
 communication channel.
 Replacing zk with paxos running within the master processes has the following benefits:
 1. better master failover performance: all masters, the active and the 
 standby ones alike, have the same latest states in memory (except lagging ones, which 
 can eventually catch up later on). Whenever the active master dies, the newly 
 elected active master can immediately play its role without such failover 
 work as rebuilding its in-memory states by consulting the meta table and zk.
 2. better state consistency: the master's in-memory states are the only truth 
 about the system, which eliminates inconsistency from the very beginning. 
 And though the states are held by all masters, paxos guarantees they are 
 identical at any time.
 3. a more direct and simple communication pattern: clients change state by 
 sending requests to the master; master and regionservers talk directly to each 
 other by sending requests and responses... none of this needs to go through a 
 third-party store like zk, which can introduce more uncertainty, worse 
 latency and more complexity.
 4. zk is then used only for liveness monitoring, to determine whether a 
 regionserver is dead, and later on we can eliminate zk totally once we build 
 heartbeats between master and regionservers.
 I know this might look like a very crazy re-architecting, but it deserves deep 
 thinking and serious discussion, right?



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HBASE-6581) Build with hadoop.profile=3.0

2014-01-11 Thread Eric Charles (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Charles updated HBASE-6581:


Attachment: HBASE-6581-7.patch

[~ted_yu] Thx for the review and sorry, I messed up my git command.

Attached another v7.
I did the following successfully:
svn co https://svn.apache.org/repos/asf/hbase/trunk
patch -p0 < HBASE-6581-7.patch
mvn clean install -DskipTests -Dhadoop.profile=3.0

The generated hbase dist works on a hadoop (trunk) cluster.

 Build with hadoop.profile=3.0
 -

 Key: HBASE-6581
 URL: https://issues.apache.org/jira/browse/HBASE-6581
 Project: HBase
  Issue Type: Bug
Reporter: Eric Charles
Assignee: Eric Charles
 Fix For: 0.98.1

 Attachments: HBASE-6581-1.patch, HBASE-6581-2.patch, 
 HBASE-6581-20130821.patch, HBASE-6581-3.patch, HBASE-6581-4.patch, 
 HBASE-6581-5.patch, HBASE-6581-6.patch, HBASE-6581-7.patch, HBASE-6581.diff, 
 HBASE-6581.diff


 Building trunk with hadoop.profile=3.0 gives exceptions (see [1]) due to 
 changes in the hadoop maven module naming (and also the usage of 3.0-SNAPSHOT 
 instead of 3.0.0-SNAPSHOT in hbase-common).
 I can provide a patch that moves most of the hadoop dependencies into their 
 respective profiles and defines the correct hadoop deps in the 3.0 
 profile.
 Please tell me if it's ok to go this way.
 Thx, Eric
 [1]
 $ mvn clean install -Dhadoop.profile=3.0
 [INFO] Scanning for projects...
 [ERROR] The build could not read 3 projects - [Help 1]
 [ERROR]   
 [ERROR]   The project org.apache.hbase:hbase-server:0.95-SNAPSHOT 
 (/d/hbase.svn/hbase-server/pom.xml) has 3 errors
 [ERROR] 'dependencies.dependency.version' for 
 org.apache.hadoop:hadoop-common:jar is missing. @ line 655, column 21
 [ERROR] 'dependencies.dependency.version' for 
 org.apache.hadoop:hadoop-annotations:jar is missing. @ line 659, column 21
 [ERROR] 'dependencies.dependency.version' for 
 org.apache.hadoop:hadoop-minicluster:jar is missing. @ line 663, column 21
 [ERROR]   
 [ERROR]   The project org.apache.hbase:hbase-common:0.95-SNAPSHOT 
 (/d/hbase.svn/hbase-common/pom.xml) has 3 errors
 [ERROR] 'dependencies.dependency.version' for 
 org.apache.hadoop:hadoop-common:jar is missing. @ line 170, column 21
 [ERROR] 'dependencies.dependency.version' for 
 org.apache.hadoop:hadoop-annotations:jar is missing. @ line 174, column 21
 [ERROR] 'dependencies.dependency.version' for 
 org.apache.hadoop:hadoop-minicluster:jar is missing. @ line 178, column 21
 [ERROR]   
 [ERROR]   The project org.apache.hbase:hbase-it:0.95-SNAPSHOT 
 (/d/hbase.svn/hbase-it/pom.xml) has 3 errors
 [ERROR] 'dependencies.dependency.version' for 
 org.apache.hadoop:hadoop-common:jar is missing. @ line 220, column 18
 [ERROR] 'dependencies.dependency.version' for 
 org.apache.hadoop:hadoop-annotations:jar is missing. @ line 224, column 21
 [ERROR] 'dependencies.dependency.version' for 
 org.apache.hadoop:hadoop-minicluster:jar is missing. @ line 228, column 21
 [ERROR] 



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-6581) Build with hadoop.profile=3.0

2014-01-11 Thread Eric Charles (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868950#comment-13868950
 ] 

Eric Charles commented on HBASE-6581:
-

[~ted_yu] Oops, master does not start today. I will check and fix. You can 
already review the patch and see if it compiles with -Dhadoop.profile=3.0.

 Build with hadoop.profile=3.0
 -

 Key: HBASE-6581
 URL: https://issues.apache.org/jira/browse/HBASE-6581
 Project: HBase
  Issue Type: Bug
Reporter: Eric Charles
Assignee: Eric Charles
 Fix For: 0.98.1

 Attachments: HBASE-6581-1.patch, HBASE-6581-2.patch, 
 HBASE-6581-20130821.patch, HBASE-6581-3.patch, HBASE-6581-4.patch, 
 HBASE-6581-5.patch, HBASE-6581-6.patch, HBASE-6581-7.patch, HBASE-6581.diff, 
 HBASE-6581.diff


 Building trunk with hadoop.profile=3.0 gives exceptions (see [1]) due to 
 changes in the hadoop maven module naming (and also the usage of 3.0-SNAPSHOT 
 instead of 3.0.0-SNAPSHOT in hbase-common).
 I can provide a patch that moves most of the hadoop dependencies into their 
 respective profiles and defines the correct hadoop deps in the 3.0 
 profile.
 Please tell me if it's ok to go this way.
 Thx, Eric
 [1]
 $ mvn clean install -Dhadoop.profile=3.0
 [INFO] Scanning for projects...
 [ERROR] The build could not read 3 projects - [Help 1]
 [ERROR]   
 [ERROR]   The project org.apache.hbase:hbase-server:0.95-SNAPSHOT 
 (/d/hbase.svn/hbase-server/pom.xml) has 3 errors
 [ERROR] 'dependencies.dependency.version' for 
 org.apache.hadoop:hadoop-common:jar is missing. @ line 655, column 21
 [ERROR] 'dependencies.dependency.version' for 
 org.apache.hadoop:hadoop-annotations:jar is missing. @ line 659, column 21
 [ERROR] 'dependencies.dependency.version' for 
 org.apache.hadoop:hadoop-minicluster:jar is missing. @ line 663, column 21
 [ERROR]   
 [ERROR]   The project org.apache.hbase:hbase-common:0.95-SNAPSHOT 
 (/d/hbase.svn/hbase-common/pom.xml) has 3 errors
 [ERROR] 'dependencies.dependency.version' for 
 org.apache.hadoop:hadoop-common:jar is missing. @ line 170, column 21
 [ERROR] 'dependencies.dependency.version' for 
 org.apache.hadoop:hadoop-annotations:jar is missing. @ line 174, column 21
 [ERROR] 'dependencies.dependency.version' for 
 org.apache.hadoop:hadoop-minicluster:jar is missing. @ line 178, column 21
 [ERROR]   
 [ERROR]   The project org.apache.hbase:hbase-it:0.95-SNAPSHOT 
 (/d/hbase.svn/hbase-it/pom.xml) has 3 errors
 [ERROR] 'dependencies.dependency.version' for 
 org.apache.hadoop:hadoop-common:jar is missing. @ line 220, column 18
 [ERROR] 'dependencies.dependency.version' for 
 org.apache.hadoop:hadoop-annotations:jar is missing. @ line 224, column 21
 [ERROR] 'dependencies.dependency.version' for 
 org.apache.hadoop:hadoop-minicluster:jar is missing. @ line 228, column 21
 [ERROR] 



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10320) Avoid ArrayList.iterator() in tight loops

2014-01-11 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868954#comment-13868954
 ] 

Anoop Sam John commented on HBASE-10320:


Interesting finding, Lars. +1 on the patch if it gives an improvement.

 Avoid ArrayList.iterator() in tight loops
 -

 Key: HBASE-10320
 URL: https://issues.apache.org/jira/browse/HBASE-10320
 Project: HBase
  Issue Type: Bug
  Components: Performance
Reporter: Lars Hofhansl
 Attachments: 10320-0.94.txt


 I noticed that in a profiler (sampler) run ScanQueryMatcher.setRow(...) 
 showed up at all.
 It turns out that the expensive part is iterating over the columns in 
 ExplicitColumnTracker.reset(). I did some microbenchmarks and found that
 {code}
 private ArrayList<X> l;
 ...
 for (int i = 0; i < l.size(); i++) {
    X x = l.get(i);
    ...
 }
 {code}
 is twice as fast as:
 {code}
 private ArrayList<X> l;
 ...
 for (X x : l) {
    ...
 }
 {code}
 The two loop shapes asymptotically converge, but even at 1m entries the 
 indexed version is still faster.
 In my tight-loop scans this provides a 5% overall performance improvement 
 when the ExplicitColumnTracker is used.
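 For illustration, a minimal standalone harness in the spirit of this 
 microbenchmark (my own sketch, not the harness used above; list size, element 
 type and warm-up count are arbitrary):
 {code}
 import java.util.ArrayList;
 import java.util.List;

 public class ArrayListLoopBench {
   public static void main(String[] args) {
     List<Integer> l = new ArrayList<>();
     for (int i = 0; i < 100000; i++) {
       l.add(i);
     }
     long sink = 0;
     // Warm up both loop shapes so the JIT compiles them before we time anything.
     for (int round = 0; round < 20; round++) {
       sink += indexed(l) + iterated(l);
     }
     long t0 = System.nanoTime();
     sink += indexed(l);
     long t1 = System.nanoTime();
     sink += iterated(l);
     long t2 = System.nanoTime();
     System.out.println("indexed : " + (t1 - t0) + " ns");
     System.out.println("iterator: " + (t2 - t1) + " ns");
     System.out.println("(sink, ignore: " + sink + ")");
   }

   // Indexed access: no Iterator object, no per-step modCount check.
   static long indexed(List<Integer> l) {
     long sum = 0;
     for (int i = 0; i < l.size(); i++) {
       sum += l.get(i);
     }
     return sum;
   }

   // Enhanced for: allocates an Iterator and calls hasNext()/next() per element.
   static long iterated(List<Integer> l) {
     long sum = 0;
     for (int x : l) {
       sum += x;
     }
     return sum;
   }
 }
 {code}
 On a typical JVM the indexed loop avoids the Iterator allocation and the 
 per-step hasNext()/next() and comodification checks, which is presumably where 
 the gap comes from.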



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Created] (HBASE-10321) CellCodec has broken the 96 client to 98 server compatibility

2014-01-11 Thread Anoop Sam John (JIRA)
Anoop Sam John created HBASE-10321:
--

 Summary: CellCodec has broken the 96 client to 98 server 
compatibility
 Key: HBASE-10321
 URL: https://issues.apache.org/jira/browse/HBASE-10321
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.0
Reporter: Anoop Sam John
Assignee: Anoop Sam John
Priority: Critical
 Fix For: 0.98.0, 0.99.0


The write/read tags added in CellCodec have broken the 96 client to 98 server 
compatibility (and 98 client to 96 server).
When a 96 client's CellCodec writes a cell, it won't write the tags part at all. But 
the server expects a tags part, at least a 0 tag length. Reading this tag length 
will consume some bytes from the next cell!

I suggest we remove the tags part from CellCodec. This codec is not used by 
default and I don't think someone will switch to CellCodec from the default 
KVCodec now... Still, I feel we can solve it.

This makes tags unsupported via CellCodec. Tag support can be added to 
CellCodec once we have connection negotiation in place (?)
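To make the framing problem concrete, here is a deliberately simplified sketch 
(hypothetical format and method names, not the real CellCodec wire format): the 
96-style writer emits no tags length, while the 98-style reader expects one, so 
the reader drifts into the next cell's bytes:
{code}
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class TagsLengthMismatch {
  // 96-style writer: length-prefixed value, no tags-length field at the end.
  static void writeCellV96(DataOutputStream out, byte[] value) throws IOException {
    out.writeInt(value.length);
    out.write(value);
  }

  // 98-style reader: expects a tags length (possibly 0) after the value.
  static byte[] readCellV98(DataInputStream in) throws IOException {
    byte[] value = new byte[in.readInt()];
    in.readFully(value);
    int tagsLen = in.readShort(); // 96 never wrote this field...
    in.skipBytes(tagsLen);        // ...so these bytes belong to the next cell.
    return value;
  }

  public static void main(String[] args) throws IOException {
    ByteArrayOutputStream buf = new ByteArrayOutputStream();
    DataOutputStream out = new DataOutputStream(buf);
    writeCellV96(out, new byte[] {1, 2, 3}); // first cell
    writeCellV96(out, new byte[] {4, 5, 6}); // second cell
    DataInputStream in =
        new DataInputStream(new ByteArrayInputStream(buf.toByteArray()));
    System.out.println(readCellV98(in).length); // 3; looks fine, but the stream
    readCellV98(in); // is now 2 bytes off: this readInt() returns a garbage
                     // length and the subsequent read fails (EOFException here).
  }
}
{code}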




--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HBASE-10321) CellCodec has broken the 96 client to 98 server compatibility

2014-01-11 Thread Anoop Sam John (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anoop Sam John updated HBASE-10321:
---

Attachment: HBASE-10321.patch

 CellCodec has broken the 96 client to 98 server compatibility
 -

 Key: HBASE-10321
 URL: https://issues.apache.org/jira/browse/HBASE-10321
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.0
Reporter: Anoop Sam John
Assignee: Anoop Sam John
Priority: Critical
 Fix For: 0.98.0, 0.99.0

 Attachments: HBASE-10321.patch


 The write/read tags added in CellCodec have broken the 96 client to 98 server 
 compatibility (and 98 client to 96 server).
 When a 96 client's CellCodec writes a cell, it won't write the tags part at all. But 
 the server expects a tags part, at least a 0 tag length. Reading this tag length 
 will consume some bytes from the next cell!
 I suggest we remove the tags part from CellCodec. This codec is not used 
 by default and I don't think someone will switch to CellCodec from the 
 default KVCodec now... Still, I feel we can solve it.
 This makes tags unsupported via CellCodec. Tag support can be added to 
 CellCodec once we have connection negotiation in place (?)



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HBASE-10321) CellCodec has broken the 96 client to 98 server compatibility

2014-01-11 Thread Anoop Sam John (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anoop Sam John updated HBASE-10321:
---

Status: Patch Available  (was: Open)

 CellCodec has broken the 96 client to 98 server compatibility
 -

 Key: HBASE-10321
 URL: https://issues.apache.org/jira/browse/HBASE-10321
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.0
Reporter: Anoop Sam John
Assignee: Anoop Sam John
Priority: Critical
 Fix For: 0.98.0, 0.99.0

 Attachments: HBASE-10321.patch


 The write/read tags added in CellCodec have broken the 96 client to 98 server 
 compatibility (and 98 client to 96 server).
 When a 96 client's CellCodec writes a cell, it won't write the tags part at all. But 
 the server expects a tags part, at least a 0 tag length. Reading this tag length 
 will consume some bytes from the next cell!
 I suggest we remove the tags part from CellCodec. This codec is not used 
 by default and I don't think someone will switch to CellCodec from the 
 default KVCodec now...
 This makes tags unsupported via CellCodec. Tag support can be added to 
 CellCodec once we have connection negotiation in place (?)



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HBASE-10321) CellCodec has broken the 96 client to 98 server compatibility

2014-01-11 Thread Anoop Sam John (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anoop Sam John updated HBASE-10321:
---

Description: 
The write/read tags added in CellCodec have broken the 96 client to 98 server 
compatibility (and 98 client to 96 server).
When a 96 client's CellCodec writes a cell, it won't write the tags part at all. But 
the server expects a tags part, at least a 0 tag length. Reading this tag length 
will consume some bytes from the next cell!

I suggest we remove the tags part from CellCodec. This codec is not used by 
default and I don't think someone will switch to CellCodec from the default 
KVCodec now...

This makes tags unsupported via CellCodec. Tag support can be added to 
CellCodec once we have connection negotiation in place (?)


  was:
The write/read tags added in CellCodec have broken the 96 client to 98 server 
compatibility (and 98 client to 96 server).
When a 96 client's CellCodec writes a cell, it won't write the tags part at all. But 
the server expects a tags part, at least a 0 tag length. Reading this tag length 
will consume some bytes from the next cell!

I suggest we remove the tags part from CellCodec. This codec is not used by 
default and I don't think someone will switch to CellCodec from the default 
KVCodec now... Still, I feel we can solve it.

This makes tags unsupported via CellCodec. Tag support can be added to 
CellCodec once we have connection negotiation in place (?)



 CellCodec has broken the 96 client to 98 server compatibility
 -

 Key: HBASE-10321
 URL: https://issues.apache.org/jira/browse/HBASE-10321
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.0
Reporter: Anoop Sam John
Assignee: Anoop Sam John
Priority: Critical
 Fix For: 0.98.0, 0.99.0

 Attachments: HBASE-10321.patch


 The write/read tags added in CellCodec have broken the 96 client to 98 server 
 compatibility (and 98 client to 96 server).
 When a 96 client's CellCodec writes a cell, it won't write the tags part at all. But 
 the server expects a tags part, at least a 0 tag length. Reading this tag length 
 will consume some bytes from the next cell!
 I suggest we remove the tags part from CellCodec. This codec is not used 
 by default and I don't think someone will switch to CellCodec from the 
 default KVCodec now...
 This makes tags unsupported via CellCodec. Tag support can be added to 
 CellCodec once we have connection negotiation in place (?)



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Created] (HBASE-10322) Strip tags from KV while sending back to client on reads

2014-01-11 Thread Anoop Sam John (JIRA)
Anoop Sam John created HBASE-10322:
--

 Summary: Strip tags from KV while sending back to client on reads
 Key: HBASE-10322
 URL: https://issues.apache.org/jira/browse/HBASE-10322
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.0
Reporter: Anoop Sam John
Assignee: Anoop Sam John
Priority: Blocker
 Fix For: 0.98.0, 0.99.0


Right now we have some inconsistency wrt sending back tags on reads. We do this 
in scans when using the Java client (codec-based cell block encoding), but during 
a Get operation, or when a pure PB-based Scan comes in, we are not sending back 
the tags. So we have to do one of the following:
1. Send back tags in the missing cases as well. But sending back visibility 
expressions / cell ACLs is not correct.
2. Don't send back tags in any case. This will be a problem when a tool like 
ExportTool uses a scan to export the table data: we would miss exporting the 
cell visibility/ACL.
3. Send back tags based on some condition, decided per scan. The simplest way is 
to pass some kind of attribute in the Scan which says whether to send back tags 
or not, but trusting something the scan itself specifies might not be correct 
IMO. The alternative is checking the user who is doing the scan, and sending 
back tags only when an HBase superuser is doing it. For a case like the Export 
Tool's, the execution should then happen as a superuser (see the sketch below).

So IMO we should go with #3.
Patch coming soon.
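A minimal sketch of what the #3 check could look like (hypothetical class and 
names, assuming the superuser list is available server-side; the actual patch 
may differ):
{code}
import java.util.Set;

public class TagReturnPolicy {
  private final Set<String> superUsers;

  public TagReturnPolicy(Set<String> superUsers) {
    this.superUsers = superUsers;
  }

  /** Decided once per Scan/Get on the server side. */
  public boolean includeTags(String requestUser) {
    // A Scan attribute would let any client ask for tags, so the decision is
    // based on who is asking rather than on what the request claims.
    return superUsers.contains(requestUser);
  }
}
{code}
With a policy like this, a tool such as ExportTool keeps cell visibility/ACL 
tags in its output simply by running as a superuser.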




--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-10227) When a region is opened, its mvcc isn't correctly recovered when there are split hlogs to replay

2014-01-11 Thread Feng Honghua (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-10227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868961#comment-13868961
 ] 

Feng Honghua commented on HBASE-10227:
--

ping [~gustavoanatoly] : any update? :-)

 When a region is opened, its mvcc isn't correctly recovered when there are 
 split hlogs to replay
 

 Key: HBASE-10227
 URL: https://issues.apache.org/jira/browse/HBASE-10227
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Reporter: Feng Honghua
Assignee: Gustavo Anatoly

 When opening a region, all stores are examined to get the max MemstoreTS, which 
 is used as the initial mvcc for the region, and then the split hlogs are 
 replayed. In fact the edits in the split hlogs have kvs with greater mvcc than 
 any MemstoreTS in any store file, but replaying them doesn't advance the 
 mvcc accordingly at all. From an overall perspective this mvcc recovery is 
 'logically' incorrect/incomplete.
 The reason this doesn't currently cause a problem is that no active scanners 
 exist and no new scanners can be created before the region opening completes, 
 so the mvcc of all kvs in the hfiles resulting from hlog replay can safely be 
 set to zero. They are just treated as kvs put 'earlier' than the ones in 
 hfiles with mvcc greater than zero (say 'earlier' since they have mvcc less 
 than the ones with non-zero mvcc, even though they were in fact put 'later'), 
 and this has no incorrect impact only because during region opening there are 
 no active scanners existing or being created.
 This bug is only 'logical' for the time being, but if later on we need mvcc to 
 survive across the region's whole lifecycle (across regionservers) and never be 
 set to zero, this bug needs to be fixed first (see the sketch below).
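 A simplified sketch of the recovery logic being described (hypothetical names, 
 not the actual HRegion code):
 {code}
 public class MvccRecoverySketch {
   // Current behavior: the region's initial read point is seeded only from the
   // max MemstoreTS found across the store files.
   static long initialReadPoint(long[] maxMemstoreTsPerStore) {
     long max = 0;
     for (long ts : maxMemstoreTsPerStore) {
       max = Math.max(max, ts);
     }
     return max;
   }

   // What the report says is missing: edits replayed from split hlogs can carry
   // a higher mvcc, so the read point should be advanced past them as well.
   static long replayAndAdvance(long readPoint, long[] replayedEditMvcc) {
     for (long mvcc : replayedEditMvcc) {
       readPoint = Math.max(readPoint, mvcc);
     }
     return readPoint;
   }
 }
 {code}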



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HBASE-6581) Build with hadoop.profile=3.0

2014-01-11 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868963#comment-13868963
 ] 

Hadoop QA commented on HBASE-6581:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12622528/HBASE-6581-7.patch
  against trunk revision .
  ATTACHMENT ID: 12622528

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 hadoop1.0{color}.  The patch compiles against the hadoop 
1.0 profile.

{color:green}+1 hadoop1.1{color}.  The patch compiles against the hadoop 
1.1 profile.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 lineLengths{color}.  The patch introduces the following lines 
longer than 100:
+<assembly xmlns="http://maven.apache.org/plugins/maven-assembly-plugin/assembly/1.1.1" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://maven.apache.org/plugins/maven-assembly-plugin/assembly/1.1.1
 http://maven.apache.org/xsd/assembly-1.1.1.xsd">
+  <!--This 'all' id is not appended to the produced bundle because we do this: 
http://maven.apache.org/plugins/maven-assembly-plugin/faq.html#required-classifiers
 -->
+<!--This plugin's configuration is used to store Eclipse m2e settings 
only. It has no influence on the Maven build itself.-->
+<project xmlns="http://maven.apache.org/POM/4.0.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 
http://maven.apache.org/xsd/maven-4.0.0.xsd">
+<!--This plugin's configuration is used to store Eclipse m2e settings 
only. It has no influence on the Maven build itself.-->
+<!--This plugin's configuration is used to store Eclipse m2e settings 
only. It has no influence on the Maven build itself.-->
+<!--This plugin's configuration is used to store Eclipse m2e settings 
only. It has no influence on the Maven build itself.-->

{color:red}-1 site{color}.  The patch appears to cause mvn site goal to 
fail.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
 

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8389//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8389//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8389//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8389//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8389//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8389//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8389//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8389//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8389//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-thrift.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8389//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/8389//console

This message is automatically generated.

 Build with hadoop.profile=3.0
 -

 Key: HBASE-6581
 URL: https://issues.apache.org/jira/browse/HBASE-6581
 Project: HBase
  Issue Type: Bug
Reporter: Eric Charles
Assignee: Eric Charles
 Fix For: 0.98.1

 Attachments: HBASE-6581-1.patch, HBASE-6581-2.patch, 
 HBASE-6581-20130821.patch, HBASE-6581-3.patch, HBASE-6581-4.patch, 
 HBASE-6581-5.patch, HBASE-6581-6.patch,