date:20111223


 [ 
https://issues.apache.org/jira/browse/HBASE-5064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-5064:
---

Status: Open  (was: Patch Available)

 use surefire tests parallelization
 --

 Key: HBASE-5064
 URL: https://issues.apache.org/jira/browse/HBASE-5064
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.94.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Attachments: 5064.patch, 5064.patch, 5064.v2.patch, 5064.v3.patch, 
 5064.v4.patch, 5064.v5.patch, 5064.v6.patch, 5064.v6.patch, 5064.v6.patch, 
 5064.v6.patch, 5064.v7.patch, 5064.v7.patch, 5064.v8.patch


 To be tried multiple times on hadoop-qa before committing.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-5085) fix test-patch script from setting the ulimit


[ 
https://issues.apache.org/jira/browse/HBASE-5085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175374#comment-13175374
 ] 

nkeywal commented on HBASE-5085:


On prebuild #584, we have:
open files  (-n) 6
max user processes  (-u) 10240
This should be ok, trying immediately.

 fix test-patch script from setting the ulimit
 -

 Key: HBASE-5085
 URL: https://issues.apache.org/jira/browse/HBASE-5085
 Project: HBase
  Issue Type: Bug
Reporter: Giridharan Kesavan
Assignee: Giridharan Kesavan
 Fix For: 0.94.0

 Attachments: 5085-v2-experiment.txt, 5085-v2-experiment.txt, 
 5085-v3-experiment.txt, 5085-v3-experiment.txt, 5085-v4-experiment.txt, 
 5085-v5.txt, hbase-5085.patch


 The test-patch.sh script sets ulimit -n 1024 just after the patch is 
 triggered. This overrides the underlying system's ulimit and causes the 
 HBase tests to fail.


[jira] [Updated] (HBASE-5064) use surefire tests parallelization


 [ 
https://issues.apache.org/jira/browse/HBASE-5064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-5064:
---

Attachment: 5064.v7.patch


[jira] [Updated] (HBASE-5088) A concurrency issue on SoftValueSortedMap

2011-12-23 Thread ramkrishna.s.vasudevan (Commented) (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Jieshan Bean updated HBASE-5088:

Attachment: HBase5088Reproduce.java
HBase-5088-trunk.patch
HBase-5088-90.patch

This problem can't easily be reproduced on a cluster, so I wrote some test
code that reproduces the same concurrency issue. See HBase5088Reproduce.java;
it reproduces the issue with high probability. Changing the TreeMap to a
ConcurrentSkipListMap solves the problem.

I have tested the patches and all the unit tests passed. On a cluster,
performance appears to drop slightly:

38764 without the patch and 37080 with the patch
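
A minimal sketch of why the swap helps, using only JDK classes. The class name, key values, and the headViewSurvivesConcurrentPuts helper are illustrative and are not taken from the attached patches. A headMap view of a ConcurrentSkipListMap is weakly consistent, so iterating it while another thread inserts does not throw ConcurrentModificationException.

```java
import java.util.concurrent.ConcurrentNavigableMap;
import java.util.concurrent.ConcurrentSkipListMap;

final class SkipListHeadMapDemo {
    // Iterates a headMap view repeatedly while a writer thread inserts keys.
    // Returns true if the reader never saw an exception or an out-of-range key.
    static boolean headViewSurvivesConcurrentPuts() {
        ConcurrentSkipListMap<Integer, String> map = new ConcurrentSkipListMap<>();
        for (int i = 0; i < 1_000; i += 2) {
            map.put(i, "v" + i);
        }
        // Live view of all keys strictly below 1000.
        ConcurrentNavigableMap<Integer, String> view = map.headMap(1_000);

        Thread writer = new Thread(() -> {
            for (int i = 1; i < 1_000; i += 2) {
                map.put(i, "v" + i);
            }
        });
        writer.start();

        boolean ok = true;
        try {
            while (writer.isAlive()) {
                for (Integer key : view.keySet()) {
                    if (key >= 1_000) {
                        ok = false;
                    }
                }
            }
        } catch (RuntimeException e) {
            ok = false;
        }

        try {
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        // After the writer finishes, the view reflects every inserted key.
        return ok && view.size() == 1_000;
    }

    public static void main(String[] args) {
        System.out.println("reader finished cleanly: " + headViewSurvivesConcurrentPuts());
    }
}
```

The sketch only demonstrates safety of the view under concurrent writes; it says nothing about throughput, which is the cost being measured above.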

A concurrency issue on SoftValueSortedMap
-

Key: HBASE-5088
URL: https://issues.apache.org/jira/browse/HBASE-5088
Project: HBase
Issue Type: Bug
Components: client
Affects Versions: 0.90.4, 0.94.0
Reporter: Jieshan Bean
Assignee: Jieshan Bean
Attachments: 5088-useMapInterfaces.txt, HBase-5088-90.patch,
HBase-5088-trunk.patch, HBase5088Reproduce.java

SoftValueSortedMap is backed by a TreeMap. All the methods in this class are
synchronized, so adding or deleting elements through these methods is safe.
But HConnectionManager#getCachedLocation uses headMap to get a view
of SoftValueSortedMap#internalMap. Once we operate
on this view map (for example add/delete) in other threads, a concurrency
issue may occur.
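
The hazard described above can be sketched in a few lines. This is not the HBase class: SynchronizedSortedMap is a hypothetical stand-in that mirrors the pattern. Every method is synchronized, but headMap hands out a live view of the backing TreeMap, so later calls on the view bypass the wrapper's lock.

```java
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical stand-in for the pattern described above; not the HBase class.
final class SynchronizedSortedMap {
    private final SortedMap<String, String> internalMap = new TreeMap<>();

    synchronized String put(String key, String value) {
        return internalMap.put(key, value);
    }

    synchronized String remove(String key) {
        return internalMap.remove(key);
    }

    // The lock is held only while the view is created. The returned view is
    // backed by internalMap, so every later call on it touches the TreeMap
    // without any synchronization.
    synchronized SortedMap<String, String> headMap(String toKey) {
        return internalMap.headMap(toKey);
    }
}

final class HeadMapViewDemo {
    // Returns the size of the view before and after a later put on the wrapper.
    static int[] viewSizesAroundPut() {
        SynchronizedSortedMap cache = new SynchronizedSortedMap();
        cache.put("a", "region-a");
        cache.put("c", "region-c");

        SortedMap<String, String> view = cache.headMap("d");
        int before = view.size();
        cache.put("b", "region-b");   // goes through the wrapper's lock
        int after = view.size();      // reads the TreeMap with no lock held
        return new int[] {before, after};
    }

    public static void main(String[] args) {
        int[] sizes = viewSizesAroundPut();
        System.out.println("view size before put: " + sizes[0]);
        System.out.println("view size after put:  " + sizes[1]);
    }
}
```

Because the view is live, a thread that iterates it while another thread calls put or remove is reading a TreeMap in the middle of a structural change, which TreeMap does not support.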

[jira] [Commented] (HBASE-5088) A concurrency issue on SoftValueSortedMap


[ 
https://issues.apache.org/jira/browse/HBASE-5088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175412#comment-13175412
 ] 

ramkrishna.s.vasudevan commented on HBASE-5088:
---

+1 on patch.


[jira] [Updated] (HBASE-5088) A concurrency issue on SoftValueSortedMap


 [ 
https://issues.apache.org/jira/browse/HBASE-5088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jieshan Bean updated HBASE-5088:


Attachment: (was: HBase-5088-90.patch)


[jira] [Updated] (HBASE-5088) A concurrency issue on SoftValueSortedMap


 [ 
https://issues.apache.org/jira/browse/HBASE-5088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jieshan Bean updated HBASE-5088:


Attachment: (was: HBase-5088-trunk.patch)


[jira] [Updated] (HBASE-5088) A concurrency issue on SoftValueSortedMap


 [ 
https://issues.apache.org/jira/browse/HBASE-5088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jieshan Bean updated HBASE-5088:


Attachment: HBase-5088-trunk.patch
HBase-5088-90.patch


[jira] [Updated] (HBASE-5064) use surefire tests parallelization


 [ 
https://issues.apache.org/jira/browse/HBASE-5064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-5064:
---

Status: Open  (was: Patch Available)

 use surefire tests parallelization
 --

 Key: HBASE-5064
 URL: https://issues.apache.org/jira/browse/HBASE-5064
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.94.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Attachments: 5064.patch, 5064.patch, 5064.v2.patch, 5064.v3.patch, 
 5064.v4.patch, 5064.v5.patch, 5064.v6.patch, 5064.v6.patch, 5064.v6.patch, 
 5064.v6.patch, 5064.v7.patch, 5064.v7.patch, 5064.v8.patch, 5064.v9.patch


 To be tried multiple times on hadoop-qa before committing.


[jira] [Commented] (HBASE-4224) Need a flush by regionserver rather than by table option


[ 
https://issues.apache.org/jira/browse/HBASE-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175461#comment-13175461
 ] 

Zhihong Yu commented on HBASE-4224:
---

Just started looking at patch v2.
For HBaseAdmin.java:
{code}
+import com.google.common.collect.Lists;
+import com.google.common.collect.Maps;
{code}
A dependency on a third-party jar shouldn't be introduced into the client package.
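
One way to avoid the dependency, sketched under the assumption that the patch only needs the Guava factory helpers Lists.newArrayList() and Maps.newHashMap() (the patch body is not quoted here), is to construct the JDK collections directly. The class and method names below are illustrative and do not come from HBaseAdmin.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: groups region names by server using JDK collections,
// with no com.google.common imports.
final class JdkCollectionsOnly {
    // Each input row is {regionName, serverName}.
    static Map<String, List<String>> groupByServer(String[][] regionAndServer) {
        // Used in place of Maps.newHashMap()
        Map<String, List<String>> byServer = new HashMap<>();
        for (String[] pair : regionAndServer) {
            String region = pair[0];
            String server = pair[1];
            List<String> regions = byServer.get(server);
            if (regions == null) {
                // Used in place of Lists.newArrayList()
                regions = new ArrayList<>();
                byServer.put(server, regions);
            }
            regions.add(region);
        }
        return byServer;
    }

    public static void main(String[] args) {
        String[][] input = {{"r1", "rs1"}, {"r2", "rs2"}, {"r3", "rs1"}};
        System.out.println(groupByServer(input));
    }
}
```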

 Need a flush by regionserver rather than by table option
 

 Key: HBASE-4224
 URL: https://issues.apache.org/jira/browse/HBASE-4224
 Project: HBase
  Issue Type: Bug
  Components: shell
Reporter: stack
Assignee: Akash Ashok
 Attachments: HBase-4224-v2.patch, HBase-4224.patch


 This evening I needed to clean out logs on the cluster. Logs are kept per 
 regionserver. To let go of logs, we need to have all edits emptied from 
 memory. The only flush is by table or region. We need to be able to flush 
 the regionserver. Need to add this.


[jira] [Updated] (HBASE-5064) use surefire tests parallelization


 [ 
https://issues.apache.org/jira/browse/HBASE-5064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-5064:
---

Attachment: 5064.v9.patch


[jira] [Updated] (HBASE-5064) use surefire tests parallelization


 [ 
https://issues.apache.org/jira/browse/HBASE-5064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-5064:
---

Status: Patch Available  (was: Open)


[jira] [Commented] (HBASE-5064) use surefire tests parallelization


[ 
https://issues.apache.org/jira/browse/HBASE-5064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175463#comment-13175463
 ] 

nkeywal commented on HBASE-5064:


#585 (3 processes, min number of threads)
Total time: 40:46.855s
Tests run: 788, Failures: 4, Errors: 1, Skipped: 9

Invalid result expected:134 but was:190
TestCoprocessorEndpoint.testAggregation

NumberFormatException
TestTableMapReduce.testMultiRegionTable
TestHFileOutputFormat.testMRIncrementalLoad
TestHFileOutputFormat.testMRIncrementalLoadWithSplit
TestHFileOutputFormat.testExcludeMinorCompaction

Hung
None

Parallelization went well: nearly 100% linearity, as we're 3 times faster 
with 3 processes.
TestCoprocessorEndpoint.testAggregation also failed in prebuild #584, so it 
could be unrelated to the parallelization.

Let's try with 6 processes.



[jira] [Commented] (HBASE-5064) use surefire tests parallelization

[
https://issues.apache.org/jira/browse/HBASE-5064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175471#comment-13175471
]

Hadoop QA commented on HBASE-5064:
--

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12508546/5064.v9.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.

+1 tests included. The patch appears to include 9 new or modified tests.

-1 javadoc. The javadoc tool appears to have generated -152 warning
messages.

+1 javac. The applied patch does not increase the total number of javac
compiler warnings.

-1 findbugs. The patch appears to introduce 77 new Findbugs (version
1.3.9) warnings.

+1 release audit. The applied patch does not increase the total number of
release audit warnings.

-1 core tests. The patch failed these unit tests:
org.apache.hadoop.hbase.mapred.TestTableMapReduce
org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

Test results:
https://builds.apache.org/job/PreCommit-HBASE-Build/586//testReport/
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/586//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/586//console

This message is automatically generated.


[jira] [Updated] (HBASE-5064) use surefire tests parallelization


 [ 
https://issues.apache.org/jira/browse/HBASE-5064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-5064:
---

Status: Open  (was: Patch Available)

 use surefire tests parallelization
 --

 Key: HBASE-5064
 URL: https://issues.apache.org/jira/browse/HBASE-5064
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.94.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Attachments: 5064.patch, 5064.patch, 5064.v10.patch, 5064.v2.patch, 
 5064.v3.patch, 5064.v4.patch, 5064.v5.patch, 5064.v6.patch, 5064.v6.patch, 
 5064.v6.patch, 5064.v6.patch, 5064.v7.patch, 5064.v7.patch, 5064.v8.patch, 
 5064.v9.patch


 To be tried multiple times on hadoop-qa before committing.


[jira] [Updated] (HBASE-5064) use surefire tests parallelization


 [ 
https://issues.apache.org/jira/browse/HBASE-5064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-5064:
---

Status: Patch Available  (was: Open)


[jira] [Updated] (HBASE-5064) use surefire tests parallelization


 [ 
https://issues.apache.org/jira/browse/HBASE-5064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-5064:
---

Attachment: 5064.v10.patch


[jira] [Commented] (HBASE-5064) use surefire tests parallelization


[ 
https://issues.apache.org/jira/browse/HBASE-5064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175475#comment-13175475
 ] 

Zhihong Yu commented on HBASE-5064:
---

For build #586:
Tests run: 775, Failures: 3, Errors: 1, Skipped: 9

So 13 tests were missing compared to those from build #585
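
The comparison can be reproduced mechanically. The sketch below parses the two surefire summary lines quoted in this thread and subtracts the run counts. SurefireSummary is a hypothetical helper written for illustration; it is not part of the HBase build or of surefire.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: holds the four counters of a surefire summary line.
final class SurefireSummary {
    private static final Pattern SUMMARY = Pattern.compile(
        "Tests run: (\\d+), Failures: (\\d+), Errors: (\\d+), Skipped: (\\d+)");

    final int run;
    final int failures;
    final int errors;
    final int skipped;

    private SurefireSummary(int run, int failures, int errors, int skipped) {
        this.run = run;
        this.failures = failures;
        this.errors = errors;
        this.skipped = skipped;
    }

    // Parses a line such as "Tests run: 788, Failures: 4, Errors: 1, Skipped: 9".
    static SurefireSummary parse(String line) {
        Matcher m = SUMMARY.matcher(line);
        if (!m.find()) {
            throw new IllegalArgumentException("Not a summary line: " + line);
        }
        return new SurefireSummary(
            Integer.parseInt(m.group(1)),
            Integer.parseInt(m.group(2)),
            Integer.parseInt(m.group(3)),
            Integer.parseInt(m.group(4)));
    }

    public static void main(String[] args) {
        SurefireSummary build585 = parse("Tests run: 788, Failures: 4, Errors: 1, Skipped: 9");
        SurefireSummary build586 = parse("Tests run: 775, Failures: 3, Errors: 1, Skipped: 9");
        System.out.println("Tests missing in #586: " + (build585.run - build586.run));
    }
}
```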


[jira] [Commented] (HBASE-5064) use surefire tests parallelization


[ 
https://issues.apache.org/jira/browse/HBASE-5064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175477#comment-13175477
 ] 

Zhihong Yu commented on HBASE-5064:
---

For build #586, the following tests had no result:

Hanging test: Running org.apache.hadoop.hbase.master.TestRestartCluster
Hanging test: Running org.apache.hadoop.hbase.regionserver.wal.TestLogRollAbort
Hanging test: Running org.apache.hadoop.hbase.client.TestMetaScanner
Hanging test: Running org.apache.hadoop.hbase.client.TestMultiParallel
Hanging test: Running org.apache.hadoop.hbase.TestGlobalMemStoreSize


[jira] [Commented] (HBASE-5064) use surefire tests parallelization


[ 
https://issues.apache.org/jira/browse/HBASE-5064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175486#comment-13175486
 ] 

nkeywal commented on HBASE-5064:


#586 (6 processes, min number of threads)
Total time: 22:54.145s
Tests run: 775, Failures: 3, Errors: 1, Skipped: 9

NumberFormatException
TestTableMapReduce.testMultiRegionTable
TestHFileOutputFormat.testMRIncrementalLoad
TestHFileOutputFormat.testMRIncrementalLoadWithSplit
TestHFileOutputFormat.testExcludeMinorCompaction

Hung
master.TestRestartCluster
regionserver.wal.TestLogRollAbort
client.TestMetaScanner
TestGlobalMemStoreSize
client.TestMultiParallel

Still nearly 100% linearity (5 times faster with 6 processes).
A lot of tests hung.
TestCoprocessorEndpoint.testAggregation didn't fail this time.
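
As a rough check of the scaling figures, the two Maven "Total time" values reported for builds #585 (3 processes) and #586 (6 processes) can be compared directly. The sketch below is illustrative only. The serial run time is not given in this thread, so the estimate against a serial run relies on the stated assumption that the 3-process run is about 3 times faster than serial.

```java
final class SpeedupEstimate {
    // Converts a Maven-style "mm:ss.SSS" duration to seconds.
    static double seconds(String minutesAndSeconds) {
        String[] parts = minutesAndSeconds.split(":");
        return Integer.parseInt(parts[0]) * 60 + Double.parseDouble(parts[1]);
    }

    // How many times faster the second run is than the first.
    static double relativeSpeedup(String slower, String faster) {
        return seconds(slower) / seconds(faster);
    }

    public static void main(String[] args) {
        // Total times quoted for #585 (3 processes) and #586 (6 processes).
        double ratio = relativeSpeedup("40:46.855", "22:54.145");

        // Assumption from the thread: 3 processes is about 3x a serial run.
        double estimatedVsSerial = 3.0 * ratio;

        System.out.printf("6 vs 3 processes: %.2fx%n", ratio);
        System.out.printf("estimated vs serial: %.2fx (efficiency %.0f%%)%n",
            estimatedVsSerial, 100.0 * estimatedVsSerial / 6.0);
    }
}
```

Going from 3 to 6 processes works out to roughly 1.8x, which is consistent with the "5 times faster" figure if the 3-process run is taken as 3x.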


[jira] [Commented] (HBASE-5088) A concurrency issue on SoftValueSortedMap


[ 
https://issues.apache.org/jira/browse/HBASE-5088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175492#comment-13175492
 ] 

Lars Hofhansl commented on HBASE-5088:
--

Probably because of ConcurrentSkipListMap; it is definitely slower than a 
plain TreeMap.





[jira] [Updated] (HBASE-5064) use surefire tests parallelization


 [ 
https://issues.apache.org/jira/browse/HBASE-5064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-5064:
---

Attachment: 5064.v11.patch

 use surefire tests parallelization
 --

 Key: HBASE-5064
 URL: https://issues.apache.org/jira/browse/HBASE-5064
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.94.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Attachments: 5064.patch, 5064.patch, 5064.v10.patch, 5064.v11.patch, 
 5064.v2.patch, 5064.v3.patch, 5064.v4.patch, 5064.v5.patch, 5064.v6.patch, 
 5064.v6.patch, 5064.v6.patch, 5064.v6.patch, 5064.v7.patch, 5064.v7.patch, 
 5064.v8.patch, 5064.v9.patch


 To be tried multiple times on hadoop-qa before committing.


[jira] [Updated] (HBASE-5064) use surefire tests parallelization


 [ 
https://issues.apache.org/jira/browse/HBASE-5064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-5064:
---

Status: Patch Available  (was: Open)


[jira] [Commented] (HBASE-5064) use surefire tests parallelization


[ 
https://issues.apache.org/jira/browse/HBASE-5064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175496#comment-13175496
 ] 

nkeywal commented on HBASE-5064:


#587 (7 processes, standard config)
Total time: 31:58.502s
Tests run: 782, Failures: 3, Errors: 1, Skipped: 9

NumberFormatException
TestTableMapReduce.testMultiRegionTable
TestHFileOutputFormat.testMRIncrementalLoad
TestHFileOutputFormat.testMRIncrementalLoadWithSplit
TestHFileOutputFormat.testExcludeMinorCompaction

Hung
replication.TestMasterReplication
master.TestMasterFailover
master.TestRollingRestart

As usual, nearly 100% linearity.
3 tests hung.
TestCoprocessorEndpoint.testAggregation didn't fail this time.




[jira] [Commented] (HBASE-5088) A concurrency issue on SoftValueSortedMap


[ 
https://issues.apache.org/jira/browse/HBASE-5088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175498#comment-13175498
 ] 

Zhihong Yu commented on HBASE-5088:
---

First, I wonder why this problem wasn't discovered earlier.

@Jieshan:
What JDK version is used? Please also give us the OS information.

Second, I want to see the breakdown of the slowdown across reads vs. writes.

 A concurrency issue on SoftValueSortedMap
 -

 Key: HBASE-5088
 URL: https://issues.apache.org/jira/browse/HBASE-5088
 Project: HBase
  Issue Type: Bug
  Components: client
Affects Versions: 0.90.4, 0.94.0
Reporter: Jieshan Bean
Assignee: Jieshan Bean
 Attachments: 5088-useMapInterfaces.txt, HBase-5088-90.patch, 
 HBase-5088-trunk.patch, HBase5088Reproduce.java


 SoftValueSortedMap is backed by a TreeMap. All the methods in this class are 
 synchronized. If we use these methods to add/delete elements, it's OK.
 But HConnectionManager#getCachedLocation uses headMap to get a view 
 of SoftValueSortedMap#internalMap. Once we operate 
 on this view map (e.g. add/delete) in other threads, a concurrency issue may 
 occur.
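
 The problem described above can be sketched in a few lines. This is only an 
 illustration under stated assumptions: SynchronizedSortedMapDemo is a 
 hypothetical stand-in for SoftValueSortedMap (it stores plain values instead of 
 soft references), and the failure is provoked from a single thread so that it 
 is reproducible. The headMap view is backed by the TreeMap directly, so the 
 wrapper's synchronized methods do not protect code that is using the view.

```java
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical stand-in: every method is synchronized, but headMap hands out
// a live view of the backing TreeMap that is not guarded by this object's lock.
class SynchronizedSortedMapDemo {
    private final TreeMap<String, String> internalMap = new TreeMap<>();

    public synchronized String put(String key, String value) {
        return internalMap.put(key, value);
    }

    public synchronized SortedMap<String, String> headMap(String toKey) {
        return internalMap.headMap(toKey);
    }

    // Returns true if iterating the view fails after a write to the backing map.
    static boolean viewBreaksAfterWrite() {
        SynchronizedSortedMapDemo map = new SynchronizedSortedMapDemo();
        map.put("a", "1");
        map.put("b", "2");
        map.put("c", "3");
        SortedMap<String, String> view = map.headMap("z");
        Iterator<String> it = view.keySet().iterator();
        it.next();
        map.put("d", "4"); // structural change through the synchronized method
        try {
            it.next();     // the view's fail-fast iterator notices the change
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("view broken by write: " + viewBreaksAfterWrite());
    }
}
```

 With real threads the outcome is less predictable than in this single-threaded 
 sketch, because TreeMap gives no guarantees at all under unsynchronized 
 concurrent modification.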


[jira] [Updated] (HBASE-5064) use surefire tests parallelization


 [ 
https://issues.apache.org/jira/browse/HBASE-5064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-5064:
---

Status: Open  (was: Patch Available)

 use surefire tests parallelization
 --

 Key: HBASE-5064
 URL: https://issues.apache.org/jira/browse/HBASE-5064
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.94.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Attachments: 5064.patch, 5064.patch, 5064.v10.patch, 5064.v11.patch, 
 5064.v2.patch, 5064.v3.patch, 5064.v4.patch, 5064.v5.patch, 5064.v6.patch, 
 5064.v6.patch, 5064.v6.patch, 5064.v6.patch, 5064.v7.patch, 5064.v7.patch, 
 5064.v8.patch, 5064.v9.patch


 To be tried multiple times on hadoop-qa before committing.


[jira] [Commented] (HBASE-5064) use surefire tests parallelization


[ 
https://issues.apache.org/jira/browse/HBASE-5064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175515#comment-13175515
 ] 

Zhihong Yu commented on HBASE-5064:
---

For build #588:
{code}
Hanging test: Running 
org.apache.hadoop.hbase.util.hbck.TestOfflineMetaRebuildBase
Hanging test: Running org.apache.hadoop.hbase.catalog.TestMetaReaderEditor
Hanging test: Running 
org.apache.hadoop.hbase.coprocessor.TestMasterCoprocessorExceptionWithRemove
Hanging test: Running 
org.apache.hadoop.hbase.replication.regionserver.TestReplicationSink
Hanging test: Running org.apache.hadoop.hbase.replication.TestMasterReplication
{code}

On my MacBook, a lot of tests were not executed after I applied 5064.v11.patch:
{code}
Tests run: 522, Failures: 0, Errors: 0, Skipped: 1

[INFO] 
[INFO] BUILD SUCCESS
[INFO] 
[INFO] Total time: 13:49.006s
{code}

 use surefire tests parallelization
 --

 Key: HBASE-5064
 URL: https://issues.apache.org/jira/browse/HBASE-5064
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.94.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Attachments: 5064.patch, 5064.patch, 5064.v10.patch, 5064.v11.patch, 
 5064.v2.patch, 5064.v3.patch, 5064.v4.patch, 5064.v5.patch, 5064.v6.patch, 
 5064.v6.patch, 5064.v6.patch, 5064.v6.patch, 5064.v7.patch, 5064.v7.patch, 
 5064.v8.patch, 5064.v9.patch


 To be tried multiple times on hadoop-qa before committing.


[jira] [Updated] (HBASE-5064) use surefire tests parallelization


 [ 
https://issues.apache.org/jira/browse/HBASE-5064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-5064:
---

Attachment: 5064.v12.patch

 use surefire tests parallelization
 --

 Key: HBASE-5064
 URL: https://issues.apache.org/jira/browse/HBASE-5064
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.94.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Attachments: 5064.patch, 5064.patch, 5064.v10.patch, 5064.v11.patch, 
 5064.v12.patch, 5064.v2.patch, 5064.v3.patch, 5064.v4.patch, 5064.v5.patch, 
 5064.v6.patch, 5064.v6.patch, 5064.v6.patch, 5064.v6.patch, 5064.v7.patch, 
 5064.v7.patch, 5064.v8.patch, 5064.v9.patch


 To be tried multiple times on hadoop-qa before committing.


[jira] [Updated] (HBASE-5064) use surefire tests parallelization


 [ 
https://issues.apache.org/jira/browse/HBASE-5064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-5064:
---

Status: Patch Available  (was: Open)

 use surefire tests parallelization
 --

 Key: HBASE-5064
 URL: https://issues.apache.org/jira/browse/HBASE-5064
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.94.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Attachments: 5064.patch, 5064.patch, 5064.v10.patch, 5064.v11.patch, 
 5064.v12.patch, 5064.v2.patch, 5064.v3.patch, 5064.v4.patch, 5064.v5.patch, 
 5064.v6.patch, 5064.v6.patch, 5064.v6.patch, 5064.v6.patch, 5064.v7.patch, 
 5064.v7.patch, 5064.v8.patch, 5064.v9.patch


 To be tried multiple times on hadoop-qa before committing.


[jira] [Commented] (HBASE-5064) use surefire tests parallelization


[ 
https://issues.apache.org/jira/browse/HBASE-5064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175517#comment-13175517
 ] 

nkeywal commented on HBASE-5064:


#588
Total time: 39:50.795s
Tests run: 768, Failures: 3, Errors: 1, Skipped: 9

Hanging test: Running 
org.apache.hadoop.hbase.util.hbck.TestOfflineMetaRebuildBase
Hanging test: Running org.apache.hadoop.hbase.catalog.TestMetaReaderEditor
Hanging test: Running 
org.apache.hadoop.hbase.coprocessor.TestMasterCoprocessorExceptionWithRemove
Hanging test: Running 
org.apache.hadoop.hbase.replication.regionserver.TestReplicationSink
Hanging test: Running org.apache.hadoop.hbase.replication.TestMasterReplication

Gonna try with only 2 threads.

 use surefire tests parallelization
 --

 Key: HBASE-5064
 URL: https://issues.apache.org/jira/browse/HBASE-5064
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.94.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
 Attachments: 5064.patch, 5064.patch, 5064.v10.patch, 5064.v11.patch, 
 5064.v12.patch, 5064.v2.patch, 5064.v3.patch, 5064.v4.patch, 5064.v5.patch, 
 5064.v6.patch, 5064.v6.patch, 5064.v6.patch, 5064.v6.patch, 5064.v7.patch, 
 5064.v7.patch, 5064.v8.patch, 5064.v9.patch


 To be tried multiple times on hadoop-qa before committing.


[jira] [Commented] (HBASE-4224) Need a flush by regionserver rather than by table option


[ 
https://issues.apache.org/jira/browse/HBASE-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175518#comment-13175518
 ] 

Zhihong Yu commented on HBASE-4224:
---

Putting the diff on review board allows me to see the changes easily.
Please submit review request there.

For ServerName.java:
{code}
+   * 2. post > 0
{code}
The above should read 'port > 0'
{code}
+  public boolean isValid() {
+    if(StringUtils.isNotBlank(hostname) && port > 0 && startcode > 0)
+      return true;
+    return false;
+  }
{code}
A single return statement should be enough.
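
As a rough sketch of that suggestion: ServerNameSketch below is a hypothetical stand-in for the class in the patch, and it assumes the intended condition is "hostname not blank, port greater than 0, startcode greater than 0". The blank check is written by hand here to keep the example self-contained; the patch itself uses StringUtils.isNotBlank.

```java
// Hypothetical stand-in for the class under review; field names follow the quoted patch.
class ServerNameSketch {
    private final String hostname;
    private final int port;
    private final long startcode;

    ServerNameSketch(String hostname, int port, long startcode) {
        this.hostname = hostname;
        this.port = port;
        this.startcode = startcode;
    }

    // Single return: the boolean expression itself is the result.
    public boolean isValid() {
        return hostname != null && !hostname.trim().isEmpty()
            && port > 0 && startcode > 0;
    }

    public static void main(String[] args) {
        System.out.println(new ServerNameSketch("example-host", 60020, 1L).isValid());
    }
}
```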

For HBaseAdmin.java:
{code}
  private ExecutorService executorService= Executors.newCachedThreadPool();
{code}
Please give the ExecutorService field a better name and add javadoc for it.

 Need a flush by regionserver rather than by table option
 

 Key: HBASE-4224
 URL: https://issues.apache.org/jira/browse/HBASE-4224
 Project: HBase
  Issue Type: Bug
  Components: shell
Reporter: stack
Assignee: Akash Ashok
 Attachments: HBase-4224-v2.patch, HBase-4224.patch


 This evening I needed to clean out logs on the cluster.  Logs are kept per 
 regionserver.  To let go of logs, we need to have all edits emptied from 
 memory.  The only flush available is by table or region.  We need to be able 
 to flush the regionserver.  Need to add this.


[jira] [Commented] (HBASE-4224) Need a flush by regionserver rather than by table option


[ 
https://issues.apache.org/jira/browse/HBASE-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175520#comment-13175520
 ] 

Zhihong Yu commented on HBASE-4224:
---

On review board, the white spaces are easy to see.
Please remove them.

 Need a flush by regionserver rather than by table option
 

 Key: HBASE-4224
 URL: https://issues.apache.org/jira/browse/HBASE-4224
 Project: HBase
  Issue Type: Bug
  Components: shell
Reporter: stack
Assignee: Akash Ashok
 Attachments: HBase-4224-v2.patch, HBase-4224.patch


 This evening I needed to clean out logs on the cluster.  Logs are kept per 
 regionserver.  To let go of logs, we need to have all edits emptied from 
 memory.  The only flush available is by table or region.  We need to be able 
 to flush the regionserver.  Need to add this.


[jira] [Commented] (HBASE-5064) use surefire tests parallelization

[
https://issues.apache.org/jira/browse/HBASE-5064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175535#comment-13175535
]

Hadoop QA commented on HBASE-5064:
--

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12508551/5064.v12.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.

+1 tests included. The patch appears to include 9 new or modified tests.

-1 javadoc. The javadoc tool appears to have generated -152 warning
messages.

+1 javac. The applied patch does not increase the total number of javac
compiler warnings.

-1 findbugs. The patch appears to introduce 77 new Findbugs (version
1.3.9) warnings.

+1 release audit. The applied patch does not increase the total number of
release audit warnings.

-1 core tests. The patch failed these unit tests:
org.apache.hadoop.hbase.mapred.TestTableMapReduce
org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

Test results:
https://builds.apache.org/job/PreCommit-HBASE-Build/589//testReport/
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/589//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/589//console

This message is automatically generated.

use surefire tests parallelization
--

Key: HBASE-5064
URL: https://issues.apache.org/jira/browse/HBASE-5064
Project: HBase
Issue Type: Improvement
Components: test
Affects Versions: 0.94.0
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor
Attachments: 5064.patch, 5064.patch, 5064.v10.patch, 5064.v11.patch,
5064.v12.patch, 5064.v2.patch, 5064.v3.patch, 5064.v4.patch, 5064.v5.patch,
5064.v6.patch, 5064.v6.patch, 5064.v6.patch, 5064.v6.patch, 5064.v7.patch,
5064.v7.patch, 5064.v8.patch, 5064.v9.patch

To be tried multiple times on hadoop-qa before committing.

[jira] [Updated] (HBASE-5070) Constraints implementation and javadoc changes

2011-12-23 Thread Jesse Yates (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-5070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesse Yates updated HBASE-5070:
---

Attachment: java_HBASE-5070-v2.patch

Attaching patch with all changes as per RB and the latest comments on this ticket. 
Should be good to go, barring any further review (though I believe all concerns were 
addressed).

 Constraints implementation and javadoc changes
 --

 Key: HBASE-5070
 URL: https://issues.apache.org/jira/browse/HBASE-5070
 Project: HBase
  Issue Type: Task
Reporter: Zhihong Yu
 Attachments: java_HBASE-5070-v2.patch


 This is a continuation of HBASE-4605
 See Stack's comments https://reviews.apache.org/r/2579/#review3980


[jira] [Commented] (HBASE-5070) Constraints implementation and javadoc changes


[ 
https://issues.apache.org/jira/browse/HBASE-5070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175547#comment-13175547
 ] 

Hadoop QA commented on HBASE-5070:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12508560/java_HBASE-5070-v2.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/590//console

This message is automatically generated.

 Constraints implementation and javadoc changes
 --

 Key: HBASE-5070
 URL: https://issues.apache.org/jira/browse/HBASE-5070
 Project: HBase
  Issue Type: Task
Reporter: Zhihong Yu
 Attachments: java_HBASE-5070-v2.patch


 This is a continuation of HBASE-4605
 See Stack's comments https://reviews.apache.org/r/2579/#review3980


[jira] [Commented] (HBASE-5088) A concurrency issue on SoftValueSortedMap

[
https://issues.apache.org/jira/browse/HBASE-5088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175554#comment-13175554
]

Lars Hofhansl commented on HBASE-5088:
--

I was also wondering why we did not discover this sooner. I assume it just
happens rarely, and since it leads to an infinite loop rather than an exception,
people might just miss it.

Jieshan, I was wondering whether you could do another comparison with
SoftValueSortedMap replaced by ConcurrentSkipListMap.

A concurrency issue on SoftValueSortedMap
-

[jira] [Commented] (HBASE-5088) A concurrency issue on SoftValueSortedMap

[
https://issues.apache.org/jira/browse/HBASE-5088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175565#comment-13175565
]

Lars Hofhansl commented on HBASE-5088:
--

Looking at trunk patch... Looks good.

Since the internalMap (which should really be called delegate, but that is a
different story) is now a ConcurrentNavigableMap, SoftValueSortedMap could
implement ConcurrentNavigableMap and delegate all extra methods to the
internalMap. If that is done then more concrete Map usages in
HConnectionManager could be replaced by interfaces.
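
For illustration only, here is a minimal sketch of why a ConcurrentNavigableMap delegate helps. SkipListViewDemo and its method name are hypothetical and not part of the patch. The headMap view of a ConcurrentSkipListMap is itself a ConcurrentNavigableMap, and its iterators are weakly consistent, so they tolerate writes to the backing map instead of failing.

```java
import java.util.Iterator;
import java.util.concurrent.ConcurrentNavigableMap;
import java.util.concurrent.ConcurrentSkipListMap;

class SkipListViewDemo {
    // Iterates a headMap view while the backing map is modified and
    // returns how many keys the iterator produced.
    static int countKeysWhileWriting() {
        ConcurrentNavigableMap<String, String> delegate = new ConcurrentSkipListMap<>();
        delegate.put("a", "1");
        delegate.put("b", "2");
        delegate.put("c", "3");

        // The view is typed by the interface, not by the concrete class.
        ConcurrentNavigableMap<String, String> view = delegate.headMap("z");
        Iterator<String> it = view.keySet().iterator();

        int seen = 0;
        it.next();
        seen++;
        delegate.put("d", "4"); // write while the iteration is in progress
        while (it.hasNext()) {  // weakly consistent: no exception is thrown
            it.next();
            seen++;
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println("keys seen: " + countKeysWhileWriting());
    }
}
```

Whether the iterator also reports the key added during the iteration is not guaranteed; the only guarantee relied on here is that it does not throw and that it visits each element that existed when it was created.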

A concurrency issue on SoftValueSortedMap
-

[jira] [Updated] (HBASE-4938) Create a HRegion.getScanner public method that allows reading from a specified readPoint


 [ 
https://issues.apache.org/jira/browse/HBASE-4938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-4938:
-

Status: Open  (was: Patch Available)

 Create a HRegion.getScanner public method that allows reading from a 
 specified readPoint
 

 Key: HBASE-4938
 URL: https://issues.apache.org/jira/browse/HBASE-4938
 Project: HBase
  Issue Type: Improvement
Reporter: dhruba borthakur
Assignee: dhruba borthakur
Priority: Minor
 Attachments: scannerMVCC1.txt, scannerMVCC1.txt, scannerMVCC1.txt


 There is an existing api HRegion.getScanner(Scan) that allows scanning a 
 table. My proposal is to extend it to HRegion.getScanner(Scan, long readPoint)
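
 As a rough illustration of what reading at a specified read point could mean: 
 the sketch below is not the HBase implementation. ReadPointScanSketch, Cell and 
 writeNumber are hypothetical names, and it assumes that each write carries a 
 monotonically increasing number and that a scanner opened at readPoint sees 
 only writes numbered at or below it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ReadPointScanSketch {
    // Hypothetical cell: a row key plus the number assigned to the write that created it.
    static final class Cell {
        final String row;
        final long writeNumber;

        Cell(String row, long writeNumber) {
            this.row = row;
            this.writeNumber = writeNumber;
        }
    }

    // Returns the rows visible to a scanner opened at the given read point.
    static List<String> scan(List<Cell> cells, long readPoint) {
        List<String> visible = new ArrayList<>();
        for (Cell c : cells) {
            if (c.writeNumber <= readPoint) {
                visible.add(c.row);
            }
        }
        return visible;
    }

    public static void main(String[] args) {
        List<Cell> cells = Arrays.asList(
            new Cell("r1", 1L), new Cell("r2", 5L), new Cell("r3", 9L));
        System.out.println(scan(cells, 5L));
    }
}
```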


[jira] [Updated] (HBASE-4938) Create a HRegion.getScanner public method that allows reading from a specified readPoint


 [ 
https://issues.apache.org/jira/browse/HBASE-4938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-4938:
-

Attachment: scannerMVCC1.txt

Attaching same file again to get a test run.

 Create a HRegion.getScanner public method that allows reading from a 
 specified readPoint
 

 Key: HBASE-4938
 URL: https://issues.apache.org/jira/browse/HBASE-4938
 Project: HBase
  Issue Type: Improvement
Reporter: dhruba borthakur
Assignee: dhruba borthakur
Priority: Minor
 Attachments: scannerMVCC1.txt, scannerMVCC1.txt, scannerMVCC1.txt


 There is an existing api HRegion.getScanner(Scan) that allows scanning a 
 table. My proposal is to extend it to HRegion.getScanner(Scan, long readPoint)


[jira] [Updated] (HBASE-4938) Create a HRegion.getScanner public method that allows reading from a specified readPoint


 [ 
https://issues.apache.org/jira/browse/HBASE-4938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-4938:
-

Status: Patch Available  (was: Open)

 Create a HRegion.getScanner public method that allows reading from a 
 specified readPoint
 

 Key: HBASE-4938
 URL: https://issues.apache.org/jira/browse/HBASE-4938
 Project: HBase
  Issue Type: Improvement
Reporter: dhruba borthakur
Assignee: dhruba borthakur
Priority: Minor
 Attachments: scannerMVCC1.txt, scannerMVCC1.txt, scannerMVCC1.txt


 There is an existing api HRegion.getScanner(Scan) that allows scanning a 
 table. My proposal is to extend it to HRegion.getScanner(Scan, long readPoint)


[jira] [Commented] (HBASE-5010) Filter HFiles based on TTL

2011-12-23 Thread Phabricator (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175575#comment-13175575
 ] 

Phabricator commented on HBASE-5010:


lhofhansl has commented on the revision [jira] [HBASE-5010] [89-fb] Filter 
HFiles based on TTL.

  lgtm
  See minor comment inline.

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java:832 Rather 
than making this public, we should locate the test class in the same package and make this 
package-private. That would reduce the chance of anybody accidentally using 
it.

REVISION DETAIL
  https://reviews.facebook.net/D909


 Filter HFiles based on TTL
 --

 Key: HBASE-5010
 URL: https://issues.apache.org/jira/browse/HBASE-5010
 Project: HBase
  Issue Type: Bug
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
 Attachments: D909.1.patch, D909.2.patch


 In ScanWildcardColumnTracker we have
 {code:java}
  
   this.oldestStamp = EnvironmentEdgeManager.currentTimeMillis() - ttl;
   ...
   private boolean isExpired(long timestamp) {
      return timestamp < oldestStamp;
   }
 {code}
 but this time range filtering does not participate in HFile selection. In one 
 real case this caused next() calls to time out because all KVs in a table got 
 expired, but next() had to iterate over the whole table to find that out. We 
 should be able to filter out those HFiles right away. I think a reasonable 
 approach is to add a default timerange filter to every scan for a CF with a 
 finite TTL and utilize existing filtering in 
 StoreFile.Reader.passesTimerangeFilter.
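
 The idea can be sketched as follows. This is an illustration under stated 
 assumptions, not the patch: TtlFileFilterSketch, FileInfo and selectFiles are 
 hypothetical names, and each file is assumed to record the newest timestamp it 
 contains. A file whose newest cell is already older than now - ttl cannot 
 contribute any unexpired cell, so it can be skipped without being read.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class TtlFileFilterSketch {
    // Hypothetical per-file metadata: the newest cell timestamp stored in the file.
    static final class FileInfo {
        final String name;
        final long maxTimestamp;

        FileInfo(String name, long maxTimestamp) {
            this.name = name;
            this.maxTimestamp = maxTimestamp;
        }
    }

    // Keeps only the files that may still contain unexpired cells.
    static List<FileInfo> selectFiles(List<FileInfo> files, long now, long ttlMillis) {
        long oldestStamp = now - ttlMillis; // same cutoff as isExpired()
        List<FileInfo> selected = new ArrayList<>();
        for (FileInfo f : files) {
            if (f.maxTimestamp >= oldestStamp) {
                selected.add(f);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        List<FileInfo> files = Arrays.asList(
            new FileInfo("f1", 100L), new FileInfo("f2", 500L), new FileInfo("f3", 1000L));
        for (FileInfo f : selectFiles(files, 1500L, 1000L)) {
            System.out.println(f.name);
        }
    }
}
```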


[jira] [Commented] (HBASE-5082) ColumnAggregationEndPoint- Causes null pointer in RS when we pass null column qualifier

2011-12-23 Thread Gary Helmling (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-5082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175587#comment-13175587
 ] 

Gary Helmling commented on HBASE-5082:
--

ColumnAggregationEndpoint is a test class that was created for use by 
TestClassLoading.  As far as I'm aware, it wasn't intended to be a user-facing 
class or utility.  For that case you would want to use the 
AggregateProtocol/AggregateImplementation coprocessor.

Maybe it would be better to just change ColumnAggregationEndpoint to default 
(package-private) visibility so that it's clear people should not try to make 
use of it?

 ColumnAggregationEndPoint- Causes null pointer in RS when we pass null column 
 qualifier
 ---

 Key: HBASE-5082
 URL: https://issues.apache.org/jira/browse/HBASE-5082
 Project: HBase
  Issue Type: Bug
Reporter: ramkrishna.s.vasudevan
Priority: Minor

 I was trying to use ColumnAggregationEndPoint.sum().
 
 In my sample I just created a column family but did not use any qualifier, 
 and inserted some data.
 
 I tried to use ColumnAggregationEndPoint.sum(qualifier, null).  When I did 
 this, inside ColumnAggregationEndPoint we call 
 scan.addColumn().  This adds the [null] array to the scan object.  Later, 
 the ScanQueryMatcher throws a NullPointerException.  
 I understand that addColumn() is meant to specify the qualifier.
 Do we need to document somewhere that the qualifier should not be null? I think 
 coprocessors can be used even in places where we don't have qualifiers. If 
 that is the case, this sample ColumnAggregationEndPoint may not work.


[jira] [Updated] (HBASE-5058) Allow HBaseAdmin to use an existing connection

2011-12-23 Thread Jonathan Hsieh (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-5058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Hsieh updated HBASE-5058:
--

Summary: Allow HBaseAdmin to use an existing connection  (was: Allow 
HBaseAmin to use an existing connection)

 Allow HBaseAdmin to use an existing connection
 --

 Key: HBASE-5058
 URL: https://issues.apache.org/jira/browse/HBASE-5058
 Project: HBase
  Issue Type: Sub-task
  Components: client
Affects Versions: 0.94.0
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Priority: Minor
 Fix For: 0.94.0

 Attachments: 5058-v2.txt, 5058-v3.txt, 5058-v3.txt, 5058.txt


 What HBASE-4805 does for HTables, this should do for HBaseAdmin.
 Along with this the shared error handling and retrying between HBaseAdmin and 
 HConnectionManager can also be improved. I'll attach a first pass patch soon.


[jira] [Commented] (HBASE-5058) Allow HBaseAdmin to use an existing connection


[ 
https://issues.apache.org/jira/browse/HBASE-5058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175602#comment-13175602
 ] 

Lars Hofhansl commented on HBASE-5058:
--

Ah thanks... I cut and pasted the same into the SVN commit log :(

 Allow HBaseAdmin to use an existing connection
 --

 Key: HBASE-5058
 URL: https://issues.apache.org/jira/browse/HBASE-5058
 Project: HBase
  Issue Type: Sub-task
  Components: client
Affects Versions: 0.94.0
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Priority: Minor
 Fix For: 0.94.0

 Attachments: 5058-v2.txt, 5058-v3.txt, 5058-v3.txt, 5058.txt


 What HBASE-4805 does for HTables, this should do for HBaseAdmin.
 Along with this the shared error handling and retrying between HBaseAdmin and 
 HConnectionManager can also be improved. I'll attach a first pass patch soon.


[jira] [Commented] (HBASE-4938) Create a HRegion.getScanner public method that allows reading from a specified readPoint

[
https://issues.apache.org/jira/browse/HBASE-4938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175603#comment-13175603
]

Hadoop QA commented on HBASE-4938:
--

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12508563/scannerMVCC1.txt
against trunk revision .

+1 @author. The patch does not contain any @author tags.

-1 tests included. The patch doesn't appear to include any new or modified
tests.
Please justify why no new tests are needed for this
patch.
Also please list what manual steps were performed to
verify this patch.

-1 javadoc. The javadoc tool appears to have generated -152 warning
messages.

+1 javac. The applied patch does not increase the total number of javac
compiler warnings.

-1 findbugs. The patch appears to introduce 77 new Findbugs (version
1.3.9) warnings.

+1 release audit. The applied patch does not increase the total number of
release audit warnings.

-1 core tests. The patch failed these unit tests:
org.apache.hadoop.hbase.mapred.TestTableMapReduce
org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

Test results:
https://builds.apache.org/job/PreCommit-HBASE-Build/591//testReport/
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/591//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/591//console

This message is automatically generated.

Create a HRegion.getScanner public method that allows reading from a
specified readPoint

Key: HBASE-4938
URL: https://issues.apache.org/jira/browse/HBASE-4938
Project: HBase
Issue Type: Improvement
Reporter: dhruba borthakur
Assignee: dhruba borthakur
Priority: Minor
Attachments: scannerMVCC1.txt, scannerMVCC1.txt, scannerMVCC1.txt

There is an existing api HRegion.getScanner(Scan) that allows scanning a
table. My proposal is to extend it to HRegion.getScanner(Scan, long readPoint)

[jira] [Commented] (HBASE-4938) Create a HRegion.getScanner public method that allows reading from a specified readPoint


[ 
https://issues.apache.org/jira/browse/HBASE-4938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175604#comment-13175604
 ] 

Lars Hofhansl commented on HBASE-4938:
--

So that looks pretty good. TestTableMapReduce and TestHFileOutputFormat are 
flaky.

 Create a HRegion.getScanner public method that allows reading from a 
 specified readPoint
 

 Key: HBASE-4938
 URL: https://issues.apache.org/jira/browse/HBASE-4938
 Project: HBase
  Issue Type: Improvement
Reporter: dhruba borthakur
Assignee: dhruba borthakur
Priority: Minor
 Attachments: scannerMVCC1.txt, scannerMVCC1.txt, scannerMVCC1.txt


 There is an existing api HRegion.getScanner(Scan) that allows scanning a 
 table. My proposal is to extend it to HRegion.getScanner(Scan, long readPoint)


[jira] [Updated] (HBASE-4218) Delta Encoding of KeyValues (aka prefix compression)


 [ 
https://issues.apache.org/jira/browse/HBASE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-4218:
--

Attachment: Data-block-encoding.patch-2011-12-23

Re-attaching for Hadoop QA test

 Delta Encoding of KeyValues  (aka prefix compression)
 -

 Key: HBASE-4218
 URL: https://issues.apache.org/jira/browse/HBASE-4218
 Project: HBase
  Issue Type: Improvement
  Components: io
Affects Versions: 0.94.0
Reporter: Jacek Migdal
Assignee: Mikhail Bautin
  Labels: compression
 Attachments: 0001-Delta-encoding-fixed-encoded-scanners.patch, 
 0001-Delta-encoding.patch, D447.1.patch, D447.10.patch, D447.11.patch, 
 D447.12.patch, D447.13.patch, D447.2.patch, D447.3.patch, D447.4.patch, 
 D447.5.patch, D447.6.patch, D447.7.patch, D447.8.patch, D447.9.patch, 
 Data-block-encoding.patch-2011-12-23, 
 Delta-encoding.patch-2011-12-22_11_52_07.patch, 
 Delta_encoding_with_memstore_TS.patch, open-source.diff


 A compression for keys. Keys are sorted in HFile and they are usually very 
 similar. Because of that, it is possible to design better compression than 
 general purpose algorithms,
 It is an additional step designed to be used in memory. It aims to save 
 memory in cache as well as speeding seeks within HFileBlocks. It should 
 improve performance a lot, if key lengths are larger than value lengths. For 
 example, it makes a lot of sense to use it when value is a counter.
 Initial tests on real data (key length = ~ 90 bytes , value length = 8 bytes) 
 shows that I could achieve decent level of compression:
  key compression ratio: 92%
  total compression ratio: 85%
  LZO on the same data: 85%
  LZO after delta encoding: 91%
 While having much better performance (20-80% faster decompression ratio than 
 LZO). Moreover, it should allow far more efficient seeking which should 
 improve performance a bit.
 It seems that a simple compression algorithms are good enough. Most of the 
 savings are due to prefix compression, int128 encoding, timestamp diffs and 
 bitfields to avoid duplication. That way, comparisons of compressed data can 
 be much faster than a byte comparator (thanks to prefix compression and 
 bitfields).
 In order to implement it in HBase, two important changes in design will be 
 needed:
 - solidify the interface to HFileBlock / HFileReader Scanner to provide seeking 
 and iterating; access to the uncompressed buffer in HFileBlock will have bad 
 performance
 - extend comparators to support comparison assuming that the first N bytes are 
 equal (or some fields are equal)
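The second change, comparing under the assumption that the first N bytes are already known to be equal, might look roughly like the sketch below. The class and method names are hypothetical and this is not HBase's comparator API; it only shows that the comparison loop can start at the known-equal offset instead of at byte 0.

```java
// Illustrative sketch only; hypothetical names, not HBase's comparator interface.
final class PrefixAwareCompareSketch {

    /**
     * Lexicographic comparison of unsigned bytes that skips the first
     * knownEqual bytes. The caller guarantees that those bytes are equal
     * in both arrays, for example because a prefix-encoded block already
     * recorded the shared prefix length.
     */
    static int compareSkippingPrefix(byte[] left, byte[] right, int knownEqual) {
        int max = Math.min(left.length, right.length);
        for (int i = knownEqual; i < max; i++) {
            int l = left[i] & 0xff;
            int r = right[i] & 0xff;
            if (l != r) {
                return l - r;
            }
        }
        // All compared bytes are equal: the shorter array sorts first.
        return left.length - right.length;
    }
}
```

With knownEqual set to 0 this degenerates to a plain byte-wise comparison, so the two variants should always agree on the sign of the result.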
 Link to a discussion about something similar:
 http://search-hadoop.com/m/5aqGXJEnaD1/hbase+windowssubj=Re+prefix+compression

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HBASE-4218) Delta Encoding of KeyValues (aka prefix compression)


 [ 
https://issues.apache.org/jira/browse/HBASE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-4218:
--

Status: Open  (was: Patch Available)

 Delta Encoding of KeyValues  (aka prefix compression)
 -

 Key: HBASE-4218
 URL: https://issues.apache.org/jira/browse/HBASE-4218
 Project: HBase
  Issue Type: Improvement
  Components: io
Affects Versions: 0.94.0
Reporter: Jacek Migdal
Assignee: Mikhail Bautin
  Labels: compression
 Attachments: 0001-Delta-encoding-fixed-encoded-scanners.patch, 
 0001-Delta-encoding.patch, D447.1.patch, D447.10.patch, D447.11.patch, 
 D447.12.patch, D447.13.patch, D447.2.patch, D447.3.patch, D447.4.patch, 
 D447.5.patch, D447.6.patch, D447.7.patch, D447.8.patch, D447.9.patch, 
 Data-block-encoding.patch-2011-12-23, 
 Delta-encoding.patch-2011-12-22_11_52_07.patch, 
 Delta_encoding_with_memstore_TS.patch, open-source.diff



[jira] [Updated] (HBASE-4218) Delta Encoding of KeyValues (aka prefix compression)


 [ 
https://issues.apache.org/jira/browse/HBASE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-4218:
--

Status: Patch Available  (was: Open)

 Delta Encoding of KeyValues  (aka prefix compression)
 -

 Key: HBASE-4218
 URL: https://issues.apache.org/jira/browse/HBASE-4218
 Project: HBase
  Issue Type: Improvement
  Components: io
Affects Versions: 0.94.0
Reporter: Jacek Migdal
Assignee: Mikhail Bautin
  Labels: compression
 Attachments: 0001-Delta-encoding-fixed-encoded-scanners.patch, 
 0001-Delta-encoding.patch, D447.1.patch, D447.10.patch, D447.11.patch, 
 D447.12.patch, D447.13.patch, D447.2.patch, D447.3.patch, D447.4.patch, 
 D447.5.patch, D447.6.patch, D447.7.patch, D447.8.patch, D447.9.patch, 
 Data-block-encoding.patch-2011-12-23, 
 Delta-encoding.patch-2011-12-22_11_52_07.patch, 
 Delta_encoding_with_memstore_TS.patch, open-source.diff



[jira] [Updated] (HBASE-4218) Delta Encoding of KeyValues (aka prefix compression)


 [ 
https://issues.apache.org/jira/browse/HBASE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-4218:
--

Attachment: (was: Data-block-encoding.patch-2011-12-23)

 Delta Encoding of KeyValues  (aka prefix compression)
 -

 Key: HBASE-4218
 URL: https://issues.apache.org/jira/browse/HBASE-4218
 Project: HBase
  Issue Type: Improvement
  Components: io
Affects Versions: 0.94.0
Reporter: Jacek Migdal
Assignee: Mikhail Bautin
  Labels: compression
 Attachments: 0001-Delta-encoding-fixed-encoded-scanners.patch, 
 0001-Delta-encoding.patch, D447.1.patch, D447.10.patch, D447.11.patch, 
 D447.12.patch, D447.13.patch, D447.2.patch, D447.3.patch, D447.4.patch, 
 D447.5.patch, D447.6.patch, D447.7.patch, D447.8.patch, D447.9.patch, 
 Data-block-encoding-2011-12-23.patch, 
 Delta-encoding.patch-2011-12-22_11_52_07.patch, 
 Delta_encoding_with_memstore_TS.patch, open-source.diff



[jira] [Updated] (HBASE-4218) Delta Encoding of KeyValues (aka prefix compression)


 [ 
https://issues.apache.org/jira/browse/HBASE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-4218:
--

Attachment: Data-block-encoding-2011-12-23.patch

 Delta Encoding of KeyValues  (aka prefix compression)
 -

 Key: HBASE-4218
 URL: https://issues.apache.org/jira/browse/HBASE-4218
 Project: HBase
  Issue Type: Improvement
  Components: io
Affects Versions: 0.94.0
Reporter: Jacek Migdal
Assignee: Mikhail Bautin
  Labels: compression
 Attachments: 0001-Delta-encoding-fixed-encoded-scanners.patch, 
 0001-Delta-encoding.patch, D447.1.patch, D447.10.patch, D447.11.patch, 
 D447.12.patch, D447.13.patch, D447.2.patch, D447.3.patch, D447.4.patch, 
 D447.5.patch, D447.6.patch, D447.7.patch, D447.8.patch, D447.9.patch, 
 Data-block-encoding-2011-12-23.patch, 
 Delta-encoding.patch-2011-12-22_11_52_07.patch, 
 Delta_encoding_with_memstore_TS.patch, open-source.diff



[jira] [Updated] (HBASE-4218) Delta Encoding of KeyValues (aka prefix compression)


 [ 
https://issues.apache.org/jira/browse/HBASE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-4218:
--

Status: Open  (was: Patch Available)

 Delta Encoding of KeyValues  (aka prefix compression)
 -

 Key: HBASE-4218
 URL: https://issues.apache.org/jira/browse/HBASE-4218
 Project: HBase
  Issue Type: Improvement
  Components: io
Affects Versions: 0.94.0
Reporter: Jacek Migdal
Assignee: Mikhail Bautin
  Labels: compression
 Attachments: 0001-Delta-encoding-fixed-encoded-scanners.patch, 
 0001-Delta-encoding.patch, D447.1.patch, D447.10.patch, D447.11.patch, 
 D447.12.patch, D447.13.patch, D447.2.patch, D447.3.patch, D447.4.patch, 
 D447.5.patch, D447.6.patch, D447.7.patch, D447.8.patch, D447.9.patch, 
 Data-block-encoding-2011-12-23.patch, 
 Delta-encoding.patch-2011-12-22_11_52_07.patch, 
 Delta_encoding_with_memstore_TS.patch, open-source.diff



[jira] [Updated] (HBASE-4218) Delta Encoding of KeyValues (aka prefix compression)


 [ 
https://issues.apache.org/jira/browse/HBASE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-4218:
--

Status: Patch Available  (was: Open)

 Delta Encoding of KeyValues  (aka prefix compression)
 -

 Key: HBASE-4218
 URL: https://issues.apache.org/jira/browse/HBASE-4218
 Project: HBase
  Issue Type: Improvement
  Components: io
Affects Versions: 0.94.0
Reporter: Jacek Migdal
Assignee: Mikhail Bautin
  Labels: compression
 Attachments: 0001-Delta-encoding-fixed-encoded-scanners.patch, 
 0001-Delta-encoding.patch, D447.1.patch, D447.10.patch, D447.11.patch, 
 D447.12.patch, D447.13.patch, D447.2.patch, D447.3.patch, D447.4.patch, 
 D447.5.patch, D447.6.patch, D447.7.patch, D447.8.patch, D447.9.patch, 
 Data-block-encoding-2011-12-23.patch, 
 Delta-encoding.patch-2011-12-22_11_52_07.patch, 
 Delta_encoding_with_memstore_TS.patch, open-source.diff



[jira] [Updated] (HBASE-4218) Data Block Encoding of KeyValues (aka delta encoding / prefix compression)


 [ 
https://issues.apache.org/jira/browse/HBASE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-4218:
--

Fix Version/s: 0.94.0
  Summary: Data Block Encoding of KeyValues  (aka delta encoding / 
prefix compression)  (was: Delta Encoding of KeyValues  (aka prefix 
compression))

 Data Block Encoding of KeyValues  (aka delta encoding / prefix compression)
 ---

 Key: HBASE-4218
 URL: https://issues.apache.org/jira/browse/HBASE-4218
 Project: HBase
  Issue Type: Improvement
  Components: io
Affects Versions: 0.94.0
Reporter: Jacek Migdal
Assignee: Mikhail Bautin
  Labels: compression
 Fix For: 0.94.0

 Attachments: 0001-Delta-encoding-fixed-encoded-scanners.patch, 
 0001-Delta-encoding.patch, D447.1.patch, D447.10.patch, D447.11.patch, 
 D447.12.patch, D447.13.patch, D447.2.patch, D447.3.patch, D447.4.patch, 
 D447.5.patch, D447.6.patch, D447.7.patch, D447.8.patch, D447.9.patch, 
 Data-block-encoding-2011-12-23.patch, 
 Delta-encoding.patch-2011-12-22_11_52_07.patch, 
 Delta_encoding_with_memstore_TS.patch, open-source.diff



[jira] [Updated] (HBASE-4938) Create a HRegion.getScanner public method that allows reading from a specified readPoint


 [ 
https://issues.apache.org/jira/browse/HBASE-4938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-4938:
--

Fix Version/s: 0.94.0
 Hadoop Flags: Reviewed

Integrated to TRUNK.

Thanks for the patch Dhruba.

Thanks for the review Lars and Todd.

 Create a HRegion.getScanner public method that allows reading from a 
 specified readPoint
 

 Key: HBASE-4938
 URL: https://issues.apache.org/jira/browse/HBASE-4938
 Project: HBase
  Issue Type: Improvement
Reporter: dhruba borthakur
Assignee: dhruba borthakur
Priority: Minor
 Fix For: 0.94.0

 Attachments: scannerMVCC1.txt, scannerMVCC1.txt, scannerMVCC1.txt


 There is an existing API, HRegion.getScanner(Scan), that allows scanning a 
 table. My proposal is to extend it to HRegion.getScanner(Scan, long readPoint).
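To make the notion of a read point concrete, here is a toy model. It is not HRegion and does not use any HBase class: ReadPointSketch, Cell, put and scan are hypothetical names, and the assumption is simply that every write is tagged with an increasing write number and that a scan at a given read point only sees writes whose number is not larger than it.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only; hypothetical names, not the HRegion/MVCC implementation.
final class ReadPointSketch {

    /** A stored value tagged with the write number that produced it. */
    static final class Cell {
        final String value;
        final long writeNumber;

        Cell(String value, long writeNumber) {
            this.value = value;
            this.writeNumber = writeNumber;
        }
    }

    private final List<Cell> cells = new ArrayList<>();
    private long nextWriteNumber = 1;

    /** Stores a value and returns the write number assigned to it. */
    long put(String value) {
        long writeNumber = nextWriteNumber++;
        cells.add(new Cell(value, writeNumber));
        return writeNumber;
    }

    /** Returns, in insertion order, the values written at or before readPoint. */
    List<String> scan(long readPoint) {
        List<String> visible = new ArrayList<>();
        for (Cell cell : cells) {
            if (cell.writeNumber <= readPoint) {
                visible.add(cell.value);
            }
        }
        return visible;
    }
}
```

In this model a caller that remembers the number returned by an earlier put can later scan with it and get a view that ignores everything written afterwards, which is the kind of control the proposed overload exposes.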


[jira] [Updated] (HBASE-5088) A concurrency issue on SoftValueSortedMap

2011-12-23 Thread Lars Hofhansl (Issue Comment Edited) (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-5088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-5088:
-

Attachment: 5088.generics.txt

Here's a version that does that.
o SoftValueSortedMap implements NavigableMap
o All extra NavigableMap are delegated to internalMap
o None of its methods are synchronized
o All generics warnings are fixed.

Please have a look.

 A concurrency issue on SoftValueSortedMap
 -

 Key: HBASE-5088
 URL: https://issues.apache.org/jira/browse/HBASE-5088
 Project: HBase
  Issue Type: Bug
  Components: client
Affects Versions: 0.90.4, 0.94.0
Reporter: Jieshan Bean
Assignee: Jieshan Bean
 Attachments: 5088-useMapInterfaces.txt, 5088.generics.txt, 
 HBase-5088-90.patch, HBase-5088-trunk.patch, HBase5088Reproduce.java


 SoftValueSortedMap is backed by a TreeMap. All the methods in this class are 
 synchronized. If we use these methods to add/delete elements, it's ok.
 But HConnectionManager#getCachedLocation uses headMap to get a view 
 from SoftValueSortedMap#internalMap. Once we operate 
 on this view map (like add/delete) in other threads, a concurrency issue may 
 occur.
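The underlying hazard can be reproduced without threads. The sketch below uses only java.util.TreeMap; the class HeadMapViewSketch and its methods are hypothetical and are not taken from any of the attached patches. It shows that headMap returns a live view of the backing map, so a structural change to the backing map invalidates an iteration over the view, and that copying the view while holding a lock is one possible way to hand out a stable snapshot.

```java
import java.util.ConcurrentModificationException;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Illustrative sketch only; hypothetical names, not code from the HBASE-5088 patches.
final class HeadMapViewSketch {

    /** Returns true if changing the backing map during iteration of a headMap view fails fast. */
    static boolean viewBreaksOnBackingMapChange() {
        TreeMap<Integer, String> backing = new TreeMap<>();
        backing.put(1, "a");
        backing.put(2, "b");
        backing.put(3, "c");
        SortedMap<Integer, String> view = backing.headMap(3); // live view of keys < 3
        try {
            for (Map.Entry<Integer, String> ignored : view.entrySet()) {
                backing.put(0, "z"); // structural change made outside the view
            }
            return false;
        } catch (ConcurrentModificationException expected) {
            return true;
        }
    }

    /** Copies the head of the map under the map's own lock, so the result is detached. */
    static SortedMap<Integer, String> snapshotHead(TreeMap<Integer, String> backing, int toKey) {
        synchronized (backing) {
            return new TreeMap<>(backing.headMap(toKey));
        }
    }
}
```

The snapshot only helps if every writer synchronizes on the same object; in a multi-threaded program an unsynchronized writer can still corrupt the TreeMap itself, which is why a concurrent map such as java.util.concurrent.ConcurrentSkipListMap is another option worth considering.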


[jira] [Issue Comment Edited] (HBASE-5088) A concurrency issue on SoftValueSortedMap

[
https://issues.apache.org/jira/browse/HBASE-5088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175632#comment-13175632
]

Lars Hofhansl edited comment on HBASE-5088 at 12/24/11 12:58 AM:
-

Here's a version that does that.
o SoftValueSortedMap implements NavigableMap
o All extra NavigableMap methods are delegated to internalMap
o None of its methods are synchronized
o All generics warnings are fixed.

Please have a look.

was (Author: lhofhansl):
Here's a version that does that.
o SoftValueSortedMap implements NavigableMap
o All extra NavigableMap are delegated to internalMap
o None of its methods are synchronized
o All generics warnings are fixed.

Please have a look.

A concurrency issue on SoftValueSortedMap
-

Key: HBASE-5088
URL: https://issues.apache.org/jira/browse/HBASE-5088
Project: HBase
Issue Type: Bug
Components: client
Affects Versions: 0.90.4, 0.94.0
Reporter: Jieshan Bean
Assignee: Jieshan Bean
Attachments: 5088-useMapInterfaces.txt, 5088.generics.txt,
HBase-5088-90.patch, HBase-5088-trunk.patch, HBase5088Reproduce.java

[jira] [Commented] (HBASE-4218) Data Block Encoding of KeyValues (aka delta encoding / prefix compression)

2011-12-23 Thread ramkrishna.s.vasudevan (Commented) (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-4218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175633#comment-13175633
]

Hadoop QA commented on HBASE-4218:
--

-1 overall. Here are the results of testing the latest attachment

http://issues.apache.org/jira/secure/attachment/12508576/Data-block-encoding-2011-12-23.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.

+1 tests included. The patch appears to include 92 new or modified tests.

-1 javadoc. The javadoc tool appears to have generated -142 warning
messages.

+1 javac. The applied patch does not increase the total number of javac
compiler warnings.

-1 findbugs. The patch appears to introduce 81 new Findbugs (version
1.3.9) warnings.

+1 release audit. The applied patch does not increase the total number of
release audit warnings.

-1 core tests. The patch failed these unit tests:
org.apache.hadoop.hbase.mapred.TestTableMapReduce
org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

Test results:
https://builds.apache.org/job/PreCommit-HBASE-Build/592//testReport/
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/592//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/592//console

This message is automatically generated.

Data Block Encoding of KeyValues (aka delta encoding / prefix compression)
---

Key: HBASE-4218
URL: https://issues.apache.org/jira/browse/HBASE-4218
Project: HBase
Issue Type: Improvement
Components: io
Affects Versions: 0.94.0
Reporter: Jacek Migdal
Assignee: Mikhail Bautin
Labels: compression
Fix For: 0.94.0

Attachments: 0001-Delta-encoding-fixed-encoded-scanners.patch,
0001-Delta-encoding.patch, D447.1.patch, D447.10.patch, D447.11.patch,
D447.12.patch, D447.13.patch, D447.2.patch, D447.3.patch, D447.4.patch,
D447.5.patch, D447.6.patch, D447.7.patch, D447.8.patch, D447.9.patch,
Data-block-encoding-2011-12-23.patch,
Delta-encoding.patch-2011-12-22_11_52_07.patch,
Delta_encoding_with_memstore_TS.patch, open-source.diff


[jira] [Commented] (HBASE-5088) A concurrency issue on SoftValueSortedMap


[ 
https://issues.apache.org/jira/browse/HBASE-5088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175640#comment-13175640
 ] 

ramkrishna.s.vasudevan commented on HBASE-5088:
---

@Lars 

Had a look at the patch. The intention is good.  

 A concurrency issue on SoftValueSortedMap
 -

 Key: HBASE-5088
 URL: https://issues.apache.org/jira/browse/HBASE-5088
 Project: HBase
  Issue Type: Bug
  Components: client
Affects Versions: 0.90.4, 0.94.0
Reporter: Jieshan Bean
Assignee: Jieshan Bean
 Attachments: 5088-useMapInterfaces.txt, 5088.generics.txt, 
 HBase-5088-90.patch, HBase-5088-trunk.patch, HBase5088Reproduce.java



[jira] [Commented] (HBASE-4938) Create a HRegion.getScanner public method that allows reading from a specified readPoint

2011-12-23 Thread Hudson (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-4938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175642#comment-13175642
 ] 

Hudson commented on HBASE-4938:
---

Integrated in HBase-TRUNK #2571 (See 
[https://builds.apache.org/job/HBase-TRUNK/2571/])
HBASE-4938 Create a HRegion.getScanner public method that allows reading 
from a specified readPoint (Dhruba)

tedyu : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/IsolationLevel.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java


 Create a HRegion.getScanner public method that allows reading from a 
 specified readPoint
 

 Key: HBASE-4938
 URL: https://issues.apache.org/jira/browse/HBASE-4938
 Project: HBase
  Issue Type: Improvement
Reporter: dhruba borthakur
Assignee: dhruba borthakur
Priority: Minor
 Fix For: 0.94.0

 Attachments: scannerMVCC1.txt, scannerMVCC1.txt, scannerMVCC1.txt


 There is an existing api HRegion.getScanner(Scan) that allows scanning a 
 table. My proposal is to extend it to HRegion.getScanner(Scan, long readPoint)

[jira] [Updated] (HBASE-4862) Splitting hlog and opening region concurrently may cause data loss

2011-12-23 Thread ramkrishna.s.vasudevan (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-4862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-4862:
--

Fix Version/s: 0.92.0

 Splitting hlog and opening region concurrently may cause data loss
 --

 Key: HBASE-4862
 URL: https://issues.apache.org/jira/browse/HBASE-4862
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.2
Reporter: chunhui shen
Assignee: chunhui shen
Priority: Critical
 Fix For: 0.92.0, 0.90.5

 Attachments: 4862-0.92.txt, 4862-v6-90.txt, 4862-v6-trunk.patch, 
 4862.patch, 4862.txt, hbase-4862v1 for 0.90.diff, hbase-4862v1 for 0.90.diff, 
 hbase-4862v1 for trunk.diff, hbase-4862v1 for trunk.diff, 
 hbase-4862v2for0.90.diff, hbase-4862v2fortrunk.diff, 
 hbase-4862v3for0.90.diff, hbase-4862v3fortrunk.diff, 
 hbase-4862v5for0.90.diff, hbase-4862v5fortrunk.diff, 
 hbase-4862v7for0.90.patch, hbase-4862v7fortrunk.patch


 Case description:
 1. The split hlog thread creates a writer for the file region 
 A/recovered.edits/123456 and is appending log entries.
 2. The regionserver is opening region A now, and in the process 
 replayRecoveredEditsIfAny() deletes the file region 
 A/recovered.edits/123456.
 3. The split hlog thread catches the IO exception and stops parsing this log 
 file. If skipError = true, it adds the file to the corrupt logs. However, 
 data for other regions in this log file will be lost.
 4. Or, if skipError = false, it checks the filesystem. Of course, the 
 filesystem is OK, so it only prints an error log and continues assigning 
 regions. Therefore, data in other log files will also be lost!
 The case may happen as follows:
 1. Move a region from server A to server B.
 2. Kill server A and server B.
 3. Restart server A and server B.
 We could prevent this exception by forbidding deletion of a recovered.edits 
 file that is being appended to by the split hlog thread.

[jira] [Commented] (HBASE-5082) ColumnAggregationEndPoint- Causes null pointer in RS when we pass null column qualifier

2011-12-23 Thread ramkrishna.s.vasudevan (Commented) (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175645#comment-13175645
]

ramkrishna.s.vasudevan commented on HBASE-5082:
---

Ok... Thanks Gary

ColumnAggregationEndPoint- Causes null pointer in RS when we pass null column
qualifier
---

Key: HBASE-5082
URL: https://issues.apache.org/jira/browse/HBASE-5082
Project: HBase
Issue Type: Bug
Reporter: ramkrishna.s.vasudevan
Priority: Minor

I was trying to use ColumnAggregationEndPoint.sum().

In my sample I just created a column family but did not use any qualifier,
and inserted some data.

I tried to use ColumnAggregationEndPoint.sum(qualifier, null). When I did
this, inside the ColumnAggregationEndPoint we call
scan.addColumn(). This adds the [null] array to the scan object. Later,
the scanQueryMatcher throws a NullPointerException.
I understand that addColumn() is meant to specify the qualifier.
Do we need to document somewhere that the qualifier should not be null? I think
coprocessors can be used even in places where we don't have qualifiers. If
that is the case, this sample ColumnAggregationEndPoint may not work.
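
One possible guard for the situation described above can be sketched as follows. This is an illustration only, not the HBase API: ColumnSelection is a hypothetical stand-in for the scan's column selection, and select() is an invented helper. The assumption is that a null qualifier should mean "the whole column family" rather than being passed through as a null column.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

public class NullQualifierGuard {

    /** Hypothetical stand-in for a scan's column selection; not the HBase Scan class. */
    public static class ColumnSelection {
        public final List<String> selected = new ArrayList<>();

        /** Selects every column in the family. */
        public void addFamily(String family) {
            selected.add(family + ":*");
        }

        /** Selects one column; rejects a null qualifier up front with a clear message. */
        public void addColumn(String family, String qualifier) {
            Objects.requireNonNull(qualifier, "qualifier must not be null");
            selected.add(family + ":" + qualifier);
        }
    }

    /** Falls back to the whole family when no qualifier is given (an assumption). */
    public static ColumnSelection select(String family, String qualifier) {
        ColumnSelection selection = new ColumnSelection();
        if (qualifier == null) {
            selection.addFamily(family);
        } else {
            selection.addColumn(family, qualifier);
        }
        return selection;
    }

    public static void main(String[] args) {
        System.out.println(select("cf", null).selected);
        System.out.println(select("cf", "q1").selected);
    }
}
```

Whether the fallback or an early, documented rejection is the better behavior is the open question raised in the comment; the sketch only shows that either choice can be made before the null reaches the matcher.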

[jira] [Commented] (HBASE-4224) Need a flush by regionserver rather than by table option

2011-12-23 Thread jirapos...@reviews.apache.org (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175649#comment-13175649
 ] 

Zhihong Yu commented on HBASE-4224:
---

com.google.common.collect isn't Apache Commons. No dependency on additional 
third party jar should be introduced.

 Need a flush by regionserver rather than by table option
 

 Key: HBASE-4224
 URL: https://issues.apache.org/jira/browse/HBASE-4224
 Project: HBase
  Issue Type: Bug
  Components: shell
Reporter: stack
Assignee: Akash Ashok
 Attachments: HBase-4224-v2.patch, HBase-4224.patch


 This evening needed to clean out logs on the cluster.  logs are by 
 regionserver.  to let go of logs, we need to have all edits emptied from 
 memory.  only flush is by table or region.  We need to be able to flush the 
 regionserver.  Need to add this.

[jira] [Commented] (HBASE-4224) Need a flush by regionserver rather than by table option

2011-12-23 Thread Akash Ashok (Commented) (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175655#comment-13175655
]

Akash Ashok commented on HBASE-4224:

com.google.common.collect is Guava, I know, but what I meant to say is that
Apache Commons itself is a third-party jar with respect to HBase. Is there any
particular reason why we shouldn't add additional third-party dependencies,
other than the fact that it's client-side code?

Let me give an example:
Assume we are using HBaseAdmin in our code to create a table. To run this
code we need to add a dependency on HBase*.jar. Since there is already a
dependency on Apache Commons Logging, the additional step of adding a jar to
your classpath is there anyway; on top of that you need to add one more
dependency.

Since this is only basic usage of Guava, I could remove the dependency. But
could you help me understand why this should be done?

Need a flush by regionserver rather than by table option

Key: HBASE-4224
URL: https://issues.apache.org/jira/browse/HBASE-4224
Project: HBase
Issue Type: Bug
Components: shell
Reporter: stack
Assignee: Akash Ashok
Attachments: HBase-4224-v2.patch, HBase-4224.patch

This evening needed to clean out logs on the cluster. logs are by
regionserver. to let go of logs, we need to have all edits emptied from
memory. only flush is by table or region. We need to be able to flush the
regionserver. Need to add this.

[jira] [Commented] (HBASE-4224) Need a flush by regionserver rather than by table option


[ 
https://issues.apache.org/jira/browse/HBASE-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175657#comment-13175657
 ] 

jirapos...@reviews.apache.org commented on HBASE-4224:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3308/
---

Review request for hbase.


Summary
---

Flush by RegionServer


This addresses bug HBase-4224.
https://issues.apache.org/jira/browse/HBase-4224


Diffs
-

  /src/main/java/org/apache/hadoop/hbase/ServerName.java 1222902 
  /src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java 1222902 
  /src/main/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java 1222902 
  /src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 1222902 

Diff: https://reviews.apache.org/r/3308/diff


Testing
---


Thanks,

Akash



 Need a flush by regionserver rather than by table option
 

 Key: HBASE-4224
 URL: https://issues.apache.org/jira/browse/HBASE-4224
 Project: HBase
  Issue Type: Bug
  Components: shell
Reporter: stack
Assignee: Akash Ashok
 Attachments: HBase-4224-v2.patch, HBase-4224.patch


 This evening needed to clean out logs on the cluster.  logs are by 
 regionserver.  to let go of logs, we need to have all edits emptied from 
 memory.  only flush is by table or region.  We need to be able to flush the 
 regionserver.  Need to add this.

[jira] [Commented] (HBASE-4224) Need a flush by regionserver rather than by table option

2011-12-23 Thread Akash Ashok (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175658#comment-13175658
 ] 

Akash Ashok commented on HBASE-4224:


Have submitted it on the review board: https://reviews.apache.org/r/3308/
Thanks

 Need a flush by regionserver rather than by table option
 

 Key: HBASE-4224
 URL: https://issues.apache.org/jira/browse/HBASE-4224
 Project: HBase
  Issue Type: Bug
  Components: shell
Reporter: stack
Assignee: Akash Ashok
 Attachments: HBase-4224-v2.patch, HBase-4224.patch


 This evening needed to clean out logs on the cluster.  logs are by 
 regionserver.  to let go of logs, we need to have all edits emptied from 
 memory.  only flush is by table or region.  We need to be able to flush the 
 regionserver.  Need to add this.

[jira] [Assigned] (HBASE-4529) Make WAL Pluggable

2011-12-23 Thread Akash Ashok (Assigned) (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-4529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akash Ashok reassigned HBASE-4529:
--

Assignee: Akash Ashok

 Make WAL Pluggable 
 ---

 Key: HBASE-4529
 URL: https://issues.apache.org/jira/browse/HBASE-4529
 Project: HBase
  Issue Type: Improvement
  Components: regionserver, wal
Reporter: Akash Ashok
Assignee: Akash Ashok
  Labels: regionserver, wal

 Make WAL a pluggable, configurable component, thus making it easier to write 
 to different filesystems (including multiple filesystems).
 From Stack:
 Pluggable WAL component would need to check that the split can deal w/
 multiple logs written by the one server concurrently (sort by sequence
 edit id after sorting on all the rest that makes up a wal log key).
 From Jesse Yates:
 It would be nice to be able to tie pluggable WAL component into a service 
 that logs directly to
 disk, rather than go through HDFS giving some potentially awesome speedup at
 the cost of having to write a logging service that handles replication, etc.
 From Karthik Tunga:
 Along with the log replaying part, logic is also needed for log roll.
 This, I think, is easier compared to the merging of the logs. Any edits less
 than the last sequence number on the file system can be removed from all
 the WALs.
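
 The two ideas quoted above (ordering edits from several logs by sequence id, and pruning edits at log roll) can be sketched as below. This is a simplified illustration under stated assumptions, not HBase code: Edit, mergeBySequenceId and dropPersisted are hypothetical names, the merge sorts on the sequence id only (Stack's note sorts on the rest of the log key first), and "less than the last sequence number" is taken literally as a strict comparison.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class MultiWalSketch {

    /** Hypothetical WAL edit: a sequence id plus an opaque payload. */
    public static final class Edit {
        public final long sequenceId;
        public final String payload;

        public Edit(long sequenceId, String payload) {
            this.sequenceId = sequenceId;
            this.payload = payload;
        }
    }

    /** Merges edits written concurrently to several logs into one list ordered by sequence id. */
    public static List<Edit> mergeBySequenceId(List<List<Edit>> logs) {
        List<Edit> merged = new ArrayList<>();
        for (List<Edit> log : logs) {
            merged.addAll(log);
        }
        merged.sort(Comparator.comparingLong((Edit e) -> e.sequenceId));
        return merged;
    }

    /** Keeps only edits that are not below the last sequence id already on the file system. */
    public static List<Edit> dropPersisted(List<Edit> log, long lastPersistedSequenceId) {
        List<Edit> kept = new ArrayList<>();
        for (Edit e : log) {
            if (e.sequenceId >= lastPersistedSequenceId) {
                kept.add(e);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Edit> walA = List.of(new Edit(1, "a1"), new Edit(4, "a4"));
        List<Edit> walB = List.of(new Edit(2, "b2"), new Edit(3, "b3"));
        for (Edit e : mergeBySequenceId(List.of(walA, walB))) {
            System.out.println(e.sequenceId + " " + e.payload);
        }
        System.out.println(dropPersisted(walA, 3).size() + " edit(s) kept in walA");
    }
}
```

 Applying dropPersisted to every log independently matches the observation that pruning is easier than merging: it needs only the one persisted sequence number, not a view across all logs.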

[jira] [Commented] (HBASE-4938) Create a HRegion.getScanner public method that allows reading from a specified readPoint

2011-12-23 Thread Hudson (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-4938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175669#comment-13175669
 ] 

Hudson commented on HBASE-4938:
---

Integrated in HBase-TRUNK-security #45 (See 
[https://builds.apache.org/job/HBase-TRUNK-security/45/])
HBASE-4938 Create a HRegion.getScanner public method that allows reading 
from a specified readPoint (Dhruba)

tedyu : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/IsolationLevel.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java


 Create a HRegion.getScanner public method that allows reading from a 
 specified readPoint
 

 Key: HBASE-4938
 URL: https://issues.apache.org/jira/browse/HBASE-4938
 Project: HBase
  Issue Type: Improvement
Reporter: dhruba borthakur
Assignee: dhruba borthakur
Priority: Minor
 Fix For: 0.94.0

 Attachments: scannerMVCC1.txt, scannerMVCC1.txt, scannerMVCC1.txt


 There is an existing api HRegion.getScanner(Scan) that allows scanning a 
 table. My proposal is to extend it to HRegion.getScanner(Scan, long readPoint)

[jira] [Commented] (HBASE-4218) Data Block Encoding of KeyValues (aka delta encoding / prefix compression)