[jira] [Commented] (HBASE-11777) Find a way to use KV.setSequenceId() on Cells on the server-side read path
[ https://issues.apache.org/jira/browse/HBASE-11777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14101901#comment-14101901 ] ramkrishna.s.vasudevan commented on HBASE-11777: bq. Server side read path to pass new interface type than Cell type Yes. Either we should change everywhere on the read path to MutableCell (the new impl), or is it enough if we just do an instanceof check for the new cell type wherever needed and call setSeqId on that? Find a way to use KV.setSequenceId() on Cells on the server-side read path -- Key: HBASE-11777 URL: https://issues.apache.org/jira/browse/HBASE-11777 Project: HBase Issue Type: Improvement Affects Versions: 0.99.0 Reporter: ramkrishna.s.vasudevan Assignee: ramkrishna.s.vasudevan Over in HBASE-11591 there was a need to set the sequenceId of the HFile on the bulk loaded KVs. Since we are trying to use the concept of Cells in the read path, if we need to use setSequenceId() then the Cell has to be converted to a KV, as only the KeyValue impl has the accessor setSequenceId(). [~anoop.hbase] suggested that we could use a server-side impl of Cell and have these accessors on it. This JIRA aims to solve this and track the related code changes that need to be carried out. -- This message was sent by Atlassian JIRA (v6.2#6252)
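The direction discussed above can be sketched as follows. The names here (SettableSequenceId, ServerCell, SeqIdDemo) are hypothetical illustrations of the idea, not the actual HBase types:

```java
// Hypothetical sketch: a server-side sub-interface of Cell exposes the
// sequence-id setter, so the read path can set it without converting to KeyValue.
interface Cell {
    long getSequenceId();
}

interface SettableSequenceId extends Cell {
    void setSequenceId(long seqId);
}

// A server-side Cell implementation carrying a mutable sequence id.
class ServerCell implements SettableSequenceId {
    private long seqId;
    public long getSequenceId() { return seqId; }
    public void setSequenceId(long seqId) { this.seqId = seqId; }
}

public class SeqIdDemo {
    public static void main(String[] args) {
        Cell c = new ServerCell();
        // The read path downcasts only where the setter is actually needed:
        if (c instanceof SettableSequenceId) {
            ((SettableSequenceId) c).setSequenceId(100L);
        }
        System.out.println(c.getSequenceId()); // prints 100
    }
}
```

This is the instanceof variant from the comment above: most of the read path keeps passing Cell around, and only the code paths that need the sequence id (e.g. bulk load) downcast to the settable type.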
[jira] [Commented] (HBASE-11772) Bulk load mvcc and seqId issues with native hfiles
[ https://issues.apache.org/jira/browse/HBASE-11772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14101904#comment-14101904 ] ramkrishna.s.vasudevan commented on HBASE-11772: In 0.98 TestBulkLoad should be working fine because {code} if (kv.getMvccVersion() <= smallestReadPoint) { kv.setMvccVersion(0); } {code} We set the kv's mvcc version to 0 and so the latest kv from the bulk loaded file should be returned back. Let me check once again. Bulk load mvcc and seqId issues with native hfiles -- Key: HBASE-11772 URL: https://issues.apache.org/jira/browse/HBASE-11772 Project: HBase Issue Type: Bug Affects Versions: 0.98.5 Reporter: Jerry He Assignee: Jerry He Priority: Critical Fix For: 0.98.6 Attachments: HBASE-11772-0.98.patch There are mvcc and seqId issues when bulk loading native hfiles -- meaning hfiles that are a direct file copy-out from hbase, not from an HFileOutputFormat job. There are differences between these two types of hfiles. Native hfiles have a possible non-zero MAX_MEMSTORE_TS_KEY value and non-zero mvcc values in cells. Native hfiles also have MAX_SEQ_ID_KEY. Native hfiles do not have BULKLOAD_TIME_KEY. Here are a couple of problems I observed when bulk loading native hfiles. 1. Cells in newly bulk loaded hfiles can be invisible to scan. It is easy to re-create. Bulk load a native hfile that has a larger mvcc value in cells, e.g. 10. If the current readpoint when initiating a scan is less than 10, the cells in the new hfile are skipped and thus become invisible. We don't reset the readpoint of a region after bulk load. 2. The current StoreFile.isBulkLoadResult() is implemented as: {code} return metadataMap.containsKey(BULKLOAD_TIME_KEY) {code} which does not detect bulk-loaded native hfiles. 3. Another observed problem is possible data loss during log recovery. It is similar to HBASE-10958 reported by [~jdcryans]. Borrowing the re-create steps from HBASE-10958:
1) Create an empty table 2) Put one row in it (let's say it gets seqid 1) 3) Bulk load one native hfile with a large seqId (e.g. 100). The native hfile can be obtained by copying out from an existing table. 4) Kill the region server that holds the table's region. Scan the table once the region is made available again. The first row, at seqid 1, will be missing, since the HFile with seqid 100 makes us believe that everything that came before it was flushed. Problem 3 is probably related to problem 2. We will be ok if we get the appended seqId during bulk load instead of the 100 from inside the file. -- This message was sent by Atlassian JIRA (v6.2#6252)
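Problem 1 above comes down to the scanner's read-point check. A minimal standalone sketch of that visibility rule (the method and class names are hypothetical, not the actual HBase scanner code): cells whose mvcc version is above the scan's read point are skipped, so a bulk-loaded file carrying mvcc 10 is invisible to a scan opened at a read point below 10.

```java
import java.util.ArrayList;
import java.util.List;

public class MvccDemo {
    // Returns the mvcc versions a scan at the given read point would see.
    static List<Long> visible(List<Long> mvccVersions, long readPoint) {
        List<Long> out = new ArrayList<>();
        for (long v : mvccVersions) {
            if (v <= readPoint) out.add(v); // cells beyond the read point are skipped
        }
        return out;
    }

    public static void main(String[] args) {
        List<Long> cells = List.of(0L, 3L, 10L);
        System.out.println(visible(cells, 5));  // [0, 3] -- the mvcc-10 cell is invisible
        System.out.println(visible(cells, 10)); // [0, 3, 10]
    }
}
```

This also shows why resetting the mvcc version to 0 at compaction/flush time (the `<= smallestReadPoint` snippet quoted in the comment above) makes cells unconditionally visible again.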
[jira] [Commented] (HBASE-4368) Expose processlist in shell (per regionserver and perhaps by cluster)
[ https://issues.apache.org/jira/browse/HBASE-4368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14101903#comment-14101903 ] Shahin Saneinejad commented on HBASE-4368: -- [~busbey]: Nope, I'm no longer working with HBase. Thanks for the heads up. Expose processlist in shell (per regionserver and perhaps by cluster) - Key: HBASE-4368 URL: https://issues.apache.org/jira/browse/HBASE-4368 Project: HBase Issue Type: Task Components: shell Reporter: stack Labels: beginner Attachments: HBASE-4368.patch HBASE-4057 adds processlist and it shows in the RS UI. This issue is about getting the processlist to show in the shell, like it does in mysql. Labelling it noob; this is a pretty substantial issue but it shouldn't be too hard -- it'd mostly be plumbing from RS into the shell. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11776) All RegionServers crash when compact when setting TTL
[ https://issues.apache.org/jira/browse/HBASE-11776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14101922#comment-14101922 ] wuchengzhi commented on HBASE-11776: [~ted_yu] [~anoop.hbase] Yes, it's the same problem. I am testing on 0.98.5 right now, and it seems to have been fixed. All RegionServers crash when compact when setting TTL - Key: HBASE-11776 URL: https://issues.apache.org/jira/browse/HBASE-11776 Project: HBase Issue Type: Bug Components: Compaction Affects Versions: 0.96.1.1 Environment: ubuntu 12.04 jdk1.7.0_06 Reporter: wuchengzhi Priority: Critical Original Estimate: 72h Remaining Estimate: 72h We created a table with a TTL on the column family. When files were selected for compaction and all of the files' KVs had expired, the compaction generated a file that contains only meta-info such as the trailer, but no KVs (size: 564 bytes), and storeFile.getReader().getMaxTimestamp() == -1. We then put data into this table fast enough that the memstore kept flushing to store files and triggering compaction tasks, and the unexpected thing happens: the store file count keeps on increasing all the time.
See the debug log: {code:title=hbase-regionServer.log|borderStyle=solid} 2014-08-17 15:41:02,689 DEBUG [regionserver60020-smallCompactions-1408258247722] regionserver.CompactSplitThread: CompactSplitThread Status: compaction_queue=(0:1), split_queue=0, merge_queue=0 2014-08-17 15:41:02,689 DEBUG [regionserver60020-smallCompactions-1408258247722] compactions.RatioBasedCompactionPolicy: Selecting compaction from 9 store files, 0 compacting, 9 eligible, 10 blocking 2014-08-17 15:41:02,689 INFO [regionserver60020-smallCompactions-1408258247722] compactions.RatioBasedCompactionPolicy: Deleting the expired store file by compaction: hdfs://hbase:9000/hbase/data/default/top_subchannel_2/0b47596c0bff1a60cf749cf1101eb642/s/c6392d54411a46cbb19350d706a298be whose maxTimeStamp is -1 while the max expired timestamp is 1408257662689 2014-08-17 15:41:02,689 DEBUG [regionserver60020-smallCompactions-1408258247722] regionserver.HStore: 0b47596c0bff1a60cf749cf1101eb642 - s: Initiating minor compaction 2014-08-17 15:41:02,689 INFO [regionserver60020-smallCompactions-1408258247722] regionserver.HRegion: Starting compaction on s in region top_subchannel_2,,1407982287422.0b47596c0bff1a60cf749cf1101eb642. 2014-08-17 15:41:02,689 INFO [regionserver60020-smallCompactions-1408258247722] regionserver.HStore: Starting compaction of 1 file(s) in s of top_subchannel_2,,1407982287422.0b47596c0bff1a60cf749cf1101eb642. 
into tmpdir=hdfs://hbase:9000/hbase/data/default/top_subchannel_2/0b47596c0bff1a60cf749cf1101eb642/.tmp, totalSize=564 2014-08-17 15:41:02,689 DEBUG [regionserver60020-smallCompactions-1408258247722] compactions.Compactor: Compacting hdfs://hbase:9000/hbase/data/default/top_subchannel_2/0b47596c0bff1a60cf749cf1101eb642/s/c6392d54411a46cbb19350d706a298be, keycount=0, bloomtype=NONE, size=564, encoding=FAST_DIFF, seqNum=45561 2014-08-17 15:41:02,711 INFO [regionserver60020-smallCompactions-1408258247722] regionserver.StoreFile: HFile Bloom filter type for f2e60ae4574a4d6eb89745d43582e9b4: NONE, but ROW specified in column family configuration 2014-08-17 15:41:02,713 DEBUG [regionserver60020-smallCompactions-1408258247722] regionserver.HRegionFileSystem: Committing store file hdfs://hbase:9000/hbase/data/default/top_subchannel_2/0b47596c0bff1a60cf749cf1101eb642/.tmp/f2e60ae4574a4d6eb89745d43582e9b4 as hdfs://hbase:9000/hbase/data/default/top_subchannel_2/0b47596c0bff1a60cf749cf1101eb642/s/f2e60ae4574a4d6eb89745d43582e9b4 2014-08-17 15:41:02,726 INFO [regionserver60020-smallCompactions-1408258247722] regionserver.StoreFile: HFile Bloom filter type for f2e60ae4574a4d6eb89745d43582e9b4: NONE, but ROW specified in column family configuration 2014-08-17 15:41:02,727 DEBUG [regionserver60020-smallCompactions-1408258247722] regionserver.HStore: Removing store files after compaction... 
2014-08-17 15:41:02,731 DEBUG [regionserver60020-smallCompactions-1408258247722] backup.HFileArchiver: Finished archiving from class org.apache.hadoop.hbase.backup.HFileArchiver$FileableStoreFile, file:hdfs://hbase:9000/hbase/data/default/top_subchannel_2/0b47596c0bff1a60cf749cf1101eb642/s/c6392d54411a46cbb19350d706a298be, to hdfs://hbase:9000/hbase/archive/data/default/top_subchannel_2/0b47596c0bff1a60cf749cf1101eb642/s/c6392d54411a46cbb19350d706a298be 2014-08-17 15:41:02,731 INFO [regionserver60020-smallCompactions-1408258247722] regionserver.HStore: Completed compaction of 1 file(s) in s of top_subchannel_2,,1407982287422.0b47596c0bff1a60cf749cf1101eb642. into f2e60ae4574a4d6eb89745d43582e9b4(size=564), total
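The loop visible in the log above can be reduced to the expiry check. This is an illustrative sketch with hypothetical names, not the actual RatioBasedCompactionPolicy code: an empty store file reports a max timestamp of -1, which always compares as "expired", but compacting it just produces another empty 564-byte file, so the same file shape is re-selected forever.

```java
public class TtlLoopDemo {
    // An hfile with no KVs has no timestamps; its reader reports -1.
    static final long NO_TIMESTAMP = -1;

    // A file is treated as fully expired when every cell is older than
    // (now - TTL), i.e. its max timestamp is below the expiry cutoff.
    static boolean isExpired(long maxTimestamp, long maxExpiredTimestamp) {
        return maxTimestamp < maxExpiredTimestamp; // -1 from an empty file always matches
    }

    public static void main(String[] args) {
        long maxExpired = 1408257662689L; // cutoff from the log above
        // The empty output file is itself selected as expired again -> loop:
        System.out.println(isExpired(NO_TIMESTAMP, maxExpired));   // true
        // A file with live data is kept:
        System.out.println(isExpired(maxExpired + 1, maxExpired)); // false
    }
}
```

A fix needs to special-case files with no entries (delete them outright rather than rewriting them), which is consistent with the reporter's observation that 0.98.5 no longer shows the problem.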
[jira] [Updated] (HBASE-11645) Snapshot for MOB
[ https://issues.apache.org/jira/browse/HBASE-11645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Hsieh updated HBASE-11645: --- Description: Add snapshot support for MOB. In the initial implementation, taking a table snapshot does not preserve the mob data. This issue will make sure that when a snapshot is taken, mob data is properly preserved and is restorable. (was: Add snapshot support for MOB. In the initial implementation, taking a table snapshot does not preserve the mob data. This issue will make sure that when a snapshot is taken, mob data is properly preserved and is restorable.) Snapshot for MOB Key: HBASE-11645 URL: https://issues.apache.org/jira/browse/HBASE-11645 Project: HBase Issue Type: Sub-task Components: snapshots Reporter: Jingcheng Du Assignee: Jingcheng Du Attachments: HBASE-11645-V2.diff, HBASE-11645.diff Add snapshot support for MOB. In the initial implementation, taking a table snapshot does not preserve the mob data. This issue will make sure that when a snapshot is taken, mob data is properly preserved and is restorable. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HBASE-11776) All RegionServers crash when compact when setting TTL
[ https://issues.apache.org/jira/browse/HBASE-11776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wuchengzhi resolved HBASE-11776. Resolution: Duplicate (duplicate issue) All RegionServers crash when compact when setting TTL - Key: HBASE-11776 URL: https://issues.apache.org/jira/browse/HBASE-11776 Project: HBase Issue Type: Bug Components: Compaction Affects Versions: 0.96.1.1 Environment: ubuntu 12.04 jdk1.7.0_06 Reporter: wuchengzhi Priority: Critical Original Estimate: 72h Remaining Estimate: 72h [description and compaction log omitted; identical to the earlier HBASE-11776 message above] -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-11339) HBase MOB
[ https://issues.apache.org/jira/browse/HBASE-11339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Jiajia updated HBASE-11339: -- Attachment: MOB user guide .docx Updated the MOB user guide. HBase MOB - Key: HBASE-11339 URL: https://issues.apache.org/jira/browse/HBASE-11339 Project: HBase Issue Type: Umbrella Components: regionserver, Scanners Reporter: Jingcheng Du Assignee: Jingcheng Du Attachments: HBase MOB Design-v2.pdf, HBase MOB Design-v3.pdf, HBase MOB Design-v4.pdf, HBase MOB Design.pdf, MOB user guide .docx, hbase-11339-in-dev.patch It's quite useful to save medium-sized binary data like images and documents into Apache HBase. Unfortunately, directly saving binary MOBs (medium objects) to HBase leads to worse performance because of the frequent splits and compactions. In this design, the MOB data is stored in a more efficient way, which keeps high write/read performance and guarantees data consistency in Apache HBase. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-11339) HBase MOB
[ https://issues.apache.org/jira/browse/HBASE-11339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Jiajia updated HBASE-11339: -- Attachment: (was: MOB user guide .docx) HBase MOB - Key: HBASE-11339 URL: https://issues.apache.org/jira/browse/HBASE-11339 Project: HBase Issue Type: Umbrella Components: regionserver, Scanners Reporter: Jingcheng Du Assignee: Jingcheng Du Attachments: HBase MOB Design-v2.pdf, HBase MOB Design-v3.pdf, HBase MOB Design-v4.pdf, HBase MOB Design.pdf, MOB user guide .docx, hbase-11339-in-dev.patch It's quite useful to save medium-sized binary data like images and documents into Apache HBase. Unfortunately, directly saving binary MOBs (medium objects) to HBase leads to worse performance because of the frequent splits and compactions. In this design, the MOB data is stored in a more efficient way, which keeps high write/read performance and guarantees data consistency in Apache HBase. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-11757) Provide a common base abstract class for both RegionObserver and MasterObserver
[ https://issues.apache.org/jira/browse/HBASE-11757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matteo Bertozzi updated HBASE-11757: Fix Version/s: (was: 1.0.0) 0.99.0 Provide a common base abstract class for both RegionObserver and MasterObserver --- Key: HBASE-11757 URL: https://issues.apache.org/jira/browse/HBASE-11757 Project: HBase Issue Type: Improvement Reporter: Andrew Purtell Assignee: Matteo Bertozzi Fix For: 0.99.0, 2.0.0, 0.98.6 Attachments: HBASE-11757-0.98-v0.patch, HBASE-11757-v0.patch Some security coprocessors extend both RegionObserver and MasterObserver, unfortunately only one of the two can use the available base abstract class implementations. Provide a common base abstract class for both the RegionObserver and MasterObserver interfaces. Update current coprocessors that extend both interfaces to use the new common base abstract class. -- This message was sent by Atlassian JIRA (v6.2#6252)
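The shape of the proposal might look like the following. The hook names and simplified signatures here are placeholders, not the real RegionObserver/MasterObserver interfaces: one abstract base class implements both interfaces with no-op defaults, so a security coprocessor extends a single class and overrides only the hooks it needs.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-ins for the two observer interfaces (hooks simplified for illustration).
interface RegionObserver {
    void preGet(List<String> events);
}

interface MasterObserver {
    void preCreateTable(List<String> events);
}

// One common base: no-op defaults for every hook from both interfaces.
abstract class BaseMasterAndRegionObserver implements RegionObserver, MasterObserver {
    public void preGet(List<String> events) {}
    public void preCreateTable(List<String> events) {}
}

public class ObserverDemo {
    public static void main(String[] args) {
        List<String> events = new ArrayList<>();
        // A coprocessor that needs hooks from both interfaces overrides selectively:
        BaseMasterAndRegionObserver cp = new BaseMasterAndRegionObserver() {
            @Override public void preGet(List<String> e) { e.add("region: preGet"); }
            @Override public void preCreateTable(List<String> e) { e.add("master: preCreateTable"); }
        };
        cp.preGet(events);
        cp.preCreateTable(events);
        System.out.println(events); // [region: preGet, master: preCreateTable]
    }
}
```

Without the common base, a class can extend only one of the two existing abstract helpers and must hand-implement every method of the other interface, which is the duplication the issue removes.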
[jira] [Commented] (HBASE-11762) Record the class name of Codec in WAL header
[ https://issues.apache.org/jira/browse/HBASE-11762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14101985#comment-14101985 ] Hudson commented on HBASE-11762: FAILURE: Integrated in HBase-1.0 #110 (See [https://builds.apache.org/job/HBase-1.0/110/]) HBASE-11762 Record the class name of Codec in WAL header (tedyu: rev 12478cded70bfe375411e110deeca26db3484b2b) * hbase-protocol/src/main/java/org/apache/hadoop/hbase/protobuf/generated/WALProtos.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestCustomWALCellCodec.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/SecureProtobufLogWriter.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/SequenceFileLogReader.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/ProtobufLogWriter.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/WALCellCodec.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/ReaderBase.java * hbase-protocol/src/main/protobuf/WAL.proto * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/ProtobufLogReader.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestHLogReaderOnSecureHLog.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/SecureProtobufLogReader.java Record the class name of Codec in WAL header Key: HBASE-11762 URL: https://issues.apache.org/jira/browse/HBASE-11762 Project: HBase Issue Type: Task Components: wal Reporter: Ted Yu Assignee: Ted Yu Fix For: 1.0.0, 2.0.0, 0.98.6 Attachments: 11762-0.98.txt, 11762-v1.txt, 11762-v2.txt, 11762-v4.txt, 11762-v5.txt, 11762-v6.txt In follow-up discussion to HBASE-11620, Enis brought up this point: Related to this, should not we also write the CellCodec that we use in the WAL header. Right now, the codec comes from the configuration which means that you cannot read back the WAL files if you change the codec. 
This JIRA is to implement the above suggestion. -- This message was sent by Atlassian JIRA (v6.2#6252)
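A minimal sketch of the idea (the field and class names here are hypothetical; the actual patch records the codec in the protobuf WAL header): write the codec class name into the header at write time, and prefer it over the configured codec at read time, so older files stay readable after the configuration changes.

```java
import java.util.HashMap;
import java.util.Map;

public class WalCodecDemo {
    // Hypothetical header key for the recorded codec class name.
    static final String CODEC_KEY = "cell.codec.class";

    // Writer side: record the codec in use alongside the rest of the header.
    static Map<String, String> writeHeader(String codecClass) {
        Map<String, String> header = new HashMap<>();
        header.put(CODEC_KEY, codecClass);
        return header;
    }

    // Reader side: the header wins; fall back to the configuration only for
    // older files written before the codec was recorded.
    static String codecForReading(Map<String, String> header, String configuredCodec) {
        return header.getOrDefault(CODEC_KEY, configuredCodec);
    }

    public static void main(String[] args) {
        Map<String, String> header = writeHeader("org.example.OldCodec");
        // A file written with the old codec is still readable after a config change:
        System.out.println(codecForReading(header, "org.example.NewCodec"));
        // A pre-upgrade file without the key falls back to the configuration:
        System.out.println(codecForReading(new HashMap<>(), "org.example.NewCodec"));
    }
}
```

This is exactly the failure mode Enis describes: with the codec sourced only from configuration, changing it silently breaks replay of every existing WAL file.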
[jira] [Commented] (HBASE-4955) Use the official versions of surefire junit
[ https://issues.apache.org/jira/browse/HBASE-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14101987#comment-14101987 ] Nicolas Liochon commented on HBASE-4955: I suppose by deal breaker you mean fork vs. threads? It's not at all a deal breaker. And maybe it's actually the same thing. Surefire renamed its configuration parameters; the new names are better. Previously, threads could mean fork. Here is what we have today: - small tests are executed in a single jvm, single thread. This could be multithreaded. The idea here is that these tests are very small, so a fork is expensive; if something else works, it's fine. The issue I had initially with fork here was OOM because I had too many JVMs (the fork happens while the previous process is still alive), but it was a long time ago. - all other tests are executed with a fork per test class (even if the parameter says thread, it's actually a fork). [~posix4e], if you have something working, it's just great :-). Use the official versions of surefire junit - Key: HBASE-4955 URL: https://issues.apache.org/jira/browse/HBASE-4955 Project: HBase Issue Type: Improvement Components: test Affects Versions: 0.94.0, 0.98.0, 0.96.0, 0.99.0 Environment: all Reporter: Nicolas Liochon Assignee: Nicolas Liochon Priority: Critical Attachments: 4955.v1.patch, 4955.v2.patch, 4955.v2.patch, 4955.v2.patch, 4955.v2.patch, 4955.v3.patch, 4955.v3.patch, 4955.v3.patch, 4955.v4.patch, 4955.v4.patch, 4955.v4.patch, 4955.v4.patch, 4955.v4.patch, 4955.v4.patch, 4955.v5.patch, 4955.v6.patch, 4955.v7.patch, 4955.v7.patch, 4955.v8.patch, 4955.v9.patch, 8204.v4.patch We currently use private versions of Surefire and JUnit since HBASE-4763. This JIRA tracks what we need to move to the official versions. Surefire 2.11 is just out but, after some tests, it does not contain all that we need. JUnit: could be for JUnit 4.11.
Issue to monitor: https://github.com/KentBeck/junit/issues/359: fixed in our version, no feedback for an integration on trunk. Surefire: could be for Surefire 2.12. Issues to monitor are: 329 (category support): fixed, we use the official implementation from the trunk. 786 (@Category with forkMode=always): fixed, we use the official implementation from the trunk. 791 (incorrect elapsed time on test failure): fixed, we use the official implementation from the trunk. 793 (incorrect time in the XML report): not fixed (reopened) on trunk, fixed in our version. 760 (does not take into account the test method): fixed in trunk, not fixed in our version. 798 (print immediately the test class name): not fixed in trunk, not fixed in our version. 799 (allow test parallelization when forkMode=always): not fixed in trunk, not fixed in our version. 800 (redirectTestOutputToFile not taken into account): not yet fixed on trunk, fixed in our version. 800 and 793 are the most important to monitor; they're the only ones that are fixed in our version but not on trunk. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-4955) Use the official versions of surefire junit
[ https://issues.apache.org/jira/browse/HBASE-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14101993#comment-14101993 ] Alex Newman commented on HBASE-4955: I am testing it on my local buildserver and I seem to be having some issues. org.apache.hadoop.hbase.http.TestServletFilter.testServletFilter fails with https://gist.github.com/posix4e/4512c3e6ca49ed1a04ac Use the official versions of surefire junit - Key: HBASE-4955 URL: https://issues.apache.org/jira/browse/HBASE-4955 Project: HBase Issue Type: Improvement Components: test Affects Versions: 0.94.0, 0.98.0, 0.96.0, 0.99.0 Environment: all Reporter: Nicolas Liochon Assignee: Nicolas Liochon Priority: Critical [attachments and description omitted; identical to the earlier HBASE-4955 message above] -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-10092) Move up on to log4j2
[ https://issues.apache.org/jira/browse/HBASE-10092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14101994#comment-14101994 ] Nicolas Liochon commented on HBASE-10092: - The issue I had with commons-logging in the past was the lack of MDC (I wanted to log the htrace id). All the other wrappers support MDC. But a naive question: would it make sense to use log4j2 directly instead of using a wrapper? Move up on to log4j2 Key: HBASE-10092 URL: https://issues.apache.org/jira/browse/HBASE-10092 Project: HBase Issue Type: Sub-task Reporter: stack Assignee: Alex Newman Fix For: 2.0.0 Attachments: 10092.txt, 10092v2.txt, HBASE-10092.patch Allows logging with less friction. See http://logging.apache.org/log4j/2.x/ This rather radical transition can be done w/ minor change given they have an adapter for apache's logging, the one we use. They also have an adapter for slf4j, so we can likely remove at least some of the 4 versions of this module our dependencies make use of. I made a start in the attached patch but am currently stuck in maven dependency-resolution hell courtesy of our slf4j. Fixing will take some concentration and a good net connection, items I currently lack. Other TODOs: we will need to fix our little log-level-setting jsp page -- we will likely have to undo our use of hadoop's tool here -- and the config system changes a little. I will return to this project soon. Will bring numbers. -- This message was sent by Atlassian JIRA (v6.2#6252)
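For readers unfamiliar with MDC (mapped diagnostic context): it is a per-thread key/value map that the logging layout can interpolate into every line, e.g. to stamp each message with an htrace id. log4j2 exposes this as ThreadContext; the class below is a minimal self-contained stand-in to illustrate the mechanism, not the log4j2 API.

```java
import java.util.HashMap;
import java.util.Map;

public class MdcDemo {
    // Per-thread context map, as an MDC maintains internally.
    static final ThreadLocal<Map<String, String>> CTX =
        ThreadLocal.withInitial(HashMap::new);

    static void put(String key, String value) { CTX.get().put(key, value); }

    // A layout would pull the value via a pattern such as %X{traceId}.
    static String format(String message) {
        return "[traceId=" + CTX.get().getOrDefault("traceId", "-") + "] " + message;
    }

    public static void main(String[] args) {
        System.out.println(format("before"));           // [traceId=-] before
        put("traceId", "abc123");
        System.out.println(format("handling request")); // [traceId=abc123] handling request
    }
}
```

This is the capability commons-logging lacks: since the wrapper has no MDC API, per-request values like a trace id cannot reach the layout without switching wrappers or using the backend directly.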
[jira] [Commented] (HBASE-4955) Use the official versions of surefire junit
[ https://issues.apache.org/jira/browse/HBASE-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14101991#comment-14101991 ] Alex Newman commented on HBASE-4955: You can watch my progress at https://github.com/Ohmdata/hbase-public/tree/4955 Use the official versions of surefire junit - Key: HBASE-4955 URL: https://issues.apache.org/jira/browse/HBASE-4955 Project: HBase Issue Type: Improvement Components: test Affects Versions: 0.94.0, 0.98.0, 0.96.0, 0.99.0 Environment: all Reporter: Nicolas Liochon Assignee: Nicolas Liochon Priority: Critical [attachments and description omitted; identical to the earlier HBASE-4955 message above] -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-4955) Use the official versions of surefire junit
[ https://issues.apache.org/jira/browse/HBASE-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14101992#comment-14101992 ] Alex Newman commented on HBASE-4955: I am testing it on my local buildserver and I seem to be having some issues. org.apache.hadoop.hbase.http.TestServletFilter.testServletFilter fails with https://gist.github.com/posix4e/4512c3e6ca49ed1a04ac On Tue, Aug 19, 2014 at 12:53 AM, Nicolas Liochon (JIRA) Use the official versions of surefire junit - Key: HBASE-4955 URL: https://issues.apache.org/jira/browse/HBASE-4955 Project: HBase Issue Type: Improvement Components: test Affects Versions: 0.94.0, 0.98.0, 0.96.0, 0.99.0 Environment: all Reporter: Nicolas Liochon Assignee: Nicolas Liochon Priority: Critical Attachments: 4955.v1.patch, 4955.v2.patch, 4955.v2.patch, 4955.v2.patch, 4955.v2.patch, 4955.v3.patch, 4955.v3.patch, 4955.v3.patch, 4955.v4.patch, 4955.v4.patch, 4955.v4.patch, 4955.v4.patch, 4955.v4.patch, 4955.v4.patch, 4955.v5.patch, 4955.v6.patch, 4955.v7.patch, 4955.v7.patch, 4955.v8.patch, 4955.v9.patch, 8204.v4.patch We currently use private versions for Surefire JUnit since HBASE-4763. This JIRA traks what we need to move to official versions. Surefire 2.11 is just out, but, after some tests, it does not contain all what we need. JUnit. Could be for JUnit 4.11. Issue to monitor: https://github.com/KentBeck/junit/issues/359: fixed in our version, no feedback for an integration on trunk Surefire: Could be for Surefire 2.12. Issues to monitor are: 329 (category support): fixed, we use the official implementation from the trunk 786 (@Category with forkMode=always): fixed, we use the official implementation from the trunk 791 (incorrect elapsed time on test failure): fixed, we use the official implementation from the trunk 793 (incorrect time in the XML report): Not fixed (reopen) on trunk, fixed on our version. 
760 (does not take into account the test method): fixed in trunk, not fixed in our version 798 (print immediately the test class name): not fixed in trunk, not fixed in our version 799 (Allow test parallelization when forkMode=always): not fixed in trunk, not fixed in our version 800 (redirectTestOutputToFile not taken into account): not yet fixed on trunk, fixed in our version 800 and 793 are the most important to monitor; they are the only ones that are fixed in our version but not on trunk. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-10092) Move up on to log4j2
[ https://issues.apache.org/jira/browse/HBASE-10092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14101997#comment-14101997 ] Alex Newman commented on HBASE-10092: - The good news is that I am fairly sure unit tests will not be an issue with log4j2. As far as using it directly, I am game. But it's a much larger change. I think I am very close on this one. Move up on to log4j2 Key: HBASE-10092 URL: https://issues.apache.org/jira/browse/HBASE-10092 Project: HBase Issue Type: Sub-task Reporter: stack Assignee: Alex Newman Fix For: 2.0.0 Attachments: 10092.txt, 10092v2.txt, HBASE-10092.patch Allows logging with less friction. See http://logging.apache.org/log4j/2.x/ This rather radical transition can be done w/ minor change given they have an adapter for apache's logging, the one we use. They also have an adapter for slf4j so we likely can remove at least some of the 4 versions of this module our dependencies make use of. I made a start in the attached patch but am currently stuck in maven dependency resolve hell courtesy of our slf4j. Fixing will take some concentration and a good net connection, an item I currently lack. Other TODOs are that we will need to fix our little log level setting jsp page -- we will likely have to undo our use of hadoop's tool here -- and the config system changes a little. I will return to this project soon. Will bring numbers. -- This message was sent by Atlassian JIRA (v6.2#6252)
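For reference, the adapters mentioned (one for apache commons-logging, one for slf4j) are the log4j2 bridge modules, which would be pulled in roughly like this. This is an illustrative sketch only: the artifact ids are the real log4j2 bridge modules, but the version is a placeholder and this is not the dependency set the patch actually uses.

```xml
<!-- Illustrative only: log4j2 bridge modules; version is a placeholder. -->
<dependency>
  <groupId>org.apache.logging.log4j</groupId>
  <artifactId>log4j-jcl</artifactId>        <!-- commons-logging -> log4j2 -->
  <version>2.x</version>
</dependency>
<dependency>
  <groupId>org.apache.logging.log4j</groupId>
  <artifactId>log4j-slf4j-impl</artifactId> <!-- slf4j -> log4j2 -->
  <version>2.x</version>
</dependency>
```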
[jira] [Commented] (HBASE-11613) get_counter shell command is not displaying the result for counter columns.
[ https://issues.apache.org/jira/browse/HBASE-11613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14102003#comment-14102003 ] Y. SREENIVASULU REDDY commented on HBASE-11613: --- [~jmspaggi] get_counter 't', 'r1', 'f:c1', 'dummy' the above one is working correctly. need to update in get_counter help command same for better understanding. get_counter shell command is not displaying the result for counter columns. - Key: HBASE-11613 URL: https://issues.apache.org/jira/browse/HBASE-11613 Project: HBase Issue Type: Bug Components: shell Affects Versions: 0.98.3 Reporter: Y. SREENIVASULU REDDY Priority: Minor perform the following opertions in HBase shell prompt. 1. create a table with one column family. 2. insert some amount of data into the table. 3. then perform increment operation on any column qualifier. eg: incr 't', 'r1', 'f:c1' 4. then queried the get counter query, it is throwing nocounter found message to the user. {code} eg: hbase(main):010:0 get_counter 't', 'r1', 'f', 'c1' No counter found at specified coordinates {code} = and wrong message is throwing to user, while executing the get_counter query. {code} hbase(main):009:0 get_counter 't', 'r1', 'f' ERROR: wrong number of arguments (3 for 4) Here is some help for this command: Return a counter cell value at specified table/row/column coordinates. A cell cell should be managed with atomic increment function oh HBase and the data should be binary encoded. Example: hbase get_counter 'ns1:t1', 'r1', 'c1' hbase get_counter 't1', 'r1', 'c1' The same commands also can be run on a table reference. Suppose you had a reference t to table 't1', the corresponding command would be: hbase t.get_counter 'r1', 'c1' {code} {code} problem: In example they given 3 arguments but asking 4 arguments If run with 3 arguments it will throw error. if run with 4 arguments No counter found at specified coordinates message is throwing even though counter is specified. 
{code} -- This message was sent by Atlassian JIRA (v6.2#6252)
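Context for the working invocation in the comment above: the shell expects the column as a single 'family:qualifier' string, which is why `get_counter 't', 'r1', 'f:c1'` succeeds while passing family and qualifier as separate arguments fails. A simplified, hypothetical sketch of that column-spec parsing (not the shell's actual code):

```java
// Minimal parse of a "family:qualifier" column spec, loosely mirroring how
// HBase splits a combined column argument. Illustrative stand-in only.
class ColumnSpecDemo {
    static String[] parseColumn(String spec) {
        int i = spec.indexOf(':');
        if (i < 0) {
            return new String[] { spec }; // family only, no qualifier
        }
        return new String[] { spec.substring(0, i), spec.substring(i + 1) };
    }

    public static void main(String[] args) {
        String[] fq = parseColumn("f:c1");
        System.out.println("family=" + fq[0] + " qualifier=" + fq[1]);
    }
}
```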
[jira] [Commented] (HBASE-11610) Enhance remote meta updates
[ https://issues.apache.org/jira/browse/HBASE-11610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14102008#comment-14102008 ] Nicolas Liochon commented on HBASE-11610: - The perf improvement is great. I'm not a big fan of the ThreadLocal<HTableInterface> threadLocalHTable. It's often difficult to maintain and test. If I understand well, the issue is that the put is synchronous, and all the threads are queueing? Should we use something like HConnection#processBatchCallback(List<? extends Row> list, final TableName tableName, ExecutorService pool, Object[] results, Batch.Callback<R> callback) throws IOException, InterruptedException; instead of using HTable? HConnection is thread safe, so there is no sync needed. (ok it's deprecated, but if it saves this kind of hack maybe we need to review our point of view). Enhance remote meta updates --- Key: HBASE-11610 URL: https://issues.apache.org/jira/browse/HBASE-11610 Project: HBase Issue Type: Sub-task Reporter: Jimmy Xiang Assignee: Virag Kothari Attachments: HBASE-11610.patch Currently, if the meta region is on a regionserver instead of the master, meta update is synchronized on one HTable instance. We should be able to do better. -- This message was sent by Atlassian JIRA (v6.2#6252)
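The pattern suggested above -- hand each batch to a shared pool and collect results through a callback instead of serializing everything on one synchronized table handle -- can be sketched in plain Java. Everything here is an illustrative toy: the method shape only mimics HConnection#processBatchCallback, and no HBase code is involved.

```java
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Toy model: submit each row operation to a shared pool, fill a results
// array, and notify a callback per row. Names are hypothetical stand-ins.
class BatchDemo {
    interface Callback { void update(int index, Object result); }

    static void processBatch(List<String> rows, ExecutorService pool,
                             Object[] results, Callback cb) {
        CountDownLatch done = new CountDownLatch(rows.size());
        for (int i = 0; i < rows.size(); i++) {
            final int idx = i;
            pool.execute(() -> {
                results[idx] = "ok:" + rows.get(idx); // stands in for the RPC
                cb.update(idx, results[idx]);
                done.countDown();
            });
        }
        try {
            done.await(); // wait for the whole batch, not one put at a time
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        Object[] results = new Object[3];
        processBatch(List.of("r1", "r2", "r3"), pool, results,
                (i, r) -> System.out.println("row " + i + " -> " + r));
        pool.shutdown();
    }
}
```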
[jira] [Commented] (HBASE-11742) Backport HBASE-7987 and HBASE-11185 to 0.98
[ https://issues.apache.org/jira/browse/HBASE-11742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14102006#comment-14102006 ] Matteo Bertozzi commented on HBASE-11742: - +1 looks good to me Backport HBASE-7987 and HBASE-11185 to 0.98 --- Key: HBASE-11742 URL: https://issues.apache.org/jira/browse/HBASE-11742 Project: HBase Issue Type: Improvement Components: mapreduce, snapshots Affects Versions: 0.98.5 Reporter: Esteban Gutierrez Assignee: Esteban Gutierrez Fix For: 0.98.6 Attachments: HBASE-11742.v0.patch, HBASE-11742.v1.patch HBASE-7987 improves how snapshots are handled via a manifest file. This requires reverting HBASE-11360 since it introduces alternate functionality that is not compatible with HBASE-7987. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-4955) Use the official versions of surefire junit
[ https://issues.apache.org/jira/browse/HBASE-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14102017#comment-14102017 ] Nicolas Liochon commented on HBASE-4955: It's different from what I was having a few months ago on the Apache build: my tests were just hanging. If it's the only test that fails, it's good progress already. Use the official versions of surefire junit - Key: HBASE-4955 URL: https://issues.apache.org/jira/browse/HBASE-4955 Project: HBase Issue Type: Improvement Components: test Affects Versions: 0.94.0, 0.98.0, 0.96.0, 0.99.0 Environment: all Reporter: Nicolas Liochon Assignee: Nicolas Liochon Priority: Critical Attachments: 4955.v1.patch, 4955.v2.patch, 4955.v2.patch, 4955.v2.patch, 4955.v2.patch, 4955.v3.patch, 4955.v3.patch, 4955.v3.patch, 4955.v4.patch, 4955.v4.patch, 4955.v4.patch, 4955.v4.patch, 4955.v4.patch, 4955.v4.patch, 4955.v5.patch, 4955.v6.patch, 4955.v7.patch, 4955.v7.patch, 4955.v8.patch, 4955.v9.patch, 8204.v4.patch We currently use private versions for Surefire JUnit since HBASE-4763. This JIRA tracks what we need to move to official versions. Surefire 2.11 is just out, but, after some tests, it does not contain all that we need. JUnit. Could be for JUnit 4.11. Issue to monitor: https://github.com/KentBeck/junit/issues/359: fixed in our version, no feedback for an integration on trunk Surefire: Could be for Surefire 2.12. Issues to monitor are: 329 (category support): fixed, we use the official implementation from the trunk 786 (@Category with forkMode=always): fixed, we use the official implementation from the trunk 791 (incorrect elapsed time on test failure): fixed, we use the official implementation from the trunk 793 (incorrect time in the XML report): Not fixed (reopened) on trunk, fixed in our version. 
760 (does not take into account the test method): fixed in trunk, not fixed in our version 798 (print immediately the test class name): not fixed in trunk, not fixed in our version 799 (Allow test parallelization when forkMode=always): not fixed in trunk, not fixed in our version 800 (redirectTestOutputToFile not taken into account): not yet fixed on trunk, fixed in our version 800 and 793 are the most important to monitor; they are the only ones that are fixed in our version but not on trunk. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11762) Record the class name of Codec in WAL header
[ https://issues.apache.org/jira/browse/HBASE-11762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14102037#comment-14102037 ] Hudson commented on HBASE-11762: FAILURE: Integrated in HBase-TRUNK #5409 (See [https://builds.apache.org/job/HBase-TRUNK/5409/]) HBASE-11762 Record the class name of Codec in WAL header (tedyu: rev fd4dfb489aa4100b9bd204ad70e4ae590db93b32) * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/SecureProtobufLogReader.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestCustomWALCellCodec.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestHLogReaderOnSecureHLog.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/SecureProtobufLogWriter.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/ReaderBase.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/ProtobufLogReader.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/SequenceFileLogReader.java * hbase-protocol/src/main/java/org/apache/hadoop/hbase/protobuf/generated/WALProtos.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/WALCellCodec.java * hbase-protocol/src/main/protobuf/WAL.proto * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/ProtobufLogWriter.java Record the class name of Codec in WAL header Key: HBASE-11762 URL: https://issues.apache.org/jira/browse/HBASE-11762 Project: HBase Issue Type: Task Components: wal Reporter: Ted Yu Assignee: Ted Yu Fix For: 1.0.0, 2.0.0, 0.98.6 Attachments: 11762-0.98.txt, 11762-v1.txt, 11762-v2.txt, 11762-v4.txt, 11762-v5.txt, 11762-v6.txt In follow-up discussion to HBASE-11620, Enis brought up this point: Related to this, should not we also write the CellCodec that we use in the WAL header. 
Right now, the codec comes from the configuration which means that you cannot read back the WAL files if you change the codec. This JIRA is to implement the above suggestion. -- This message was sent by Atlassian JIRA (v6.2#6252)
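The point above -- a WAL stays readable across codec changes only if the codec used at write time is recorded in the file itself -- can be illustrated with a toy model. The header is a plain map here, and the key name and class names are made up for the sketch; the actual patch records the codec in the protobuf WAL header.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model: record the codec class name at write time, and prefer it over
// the currently configured codec at read time. Key name is hypothetical.
class WalHeaderDemo {
    static final String CODEC_KEY = "cell.codec.cls"; // assumed key name

    static Map<String, String> writeHeader(String codecClass) {
        Map<String, String> header = new HashMap<>();
        header.put(CODEC_KEY, codecClass);
        return header;
    }

    static String codecForReading(Map<String, String> header, String configuredCodec) {
        // Prefer the recorded codec; fall back to configuration for old
        // files written before the header carried the codec name.
        return header.getOrDefault(CODEC_KEY, configuredCodec);
    }

    public static void main(String[] args) {
        Map<String, String> h = writeHeader("org.example.MyCodec");
        // Even after the configuration switches codecs, the file is decoded
        // with the codec it was written with.
        System.out.println(codecForReading(h, "org.example.NewCodec"));
    }
}
```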
[jira] [Commented] (HBASE-10728) get_counter value is never used.
[ https://issues.apache.org/jira/browse/HBASE-10728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14102051#comment-14102051 ] Y. SREENIVASULU REDDY commented on HBASE-10728: --- +1 for the 0.98 patch. While committing, please handle this comment: https://issues.apache.org/jira/browse/HBASE-11613?focusedCommentId=14092624&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14092624 get_counter value is never used. Key: HBASE-10728 URL: https://issues.apache.org/jira/browse/HBASE-10728 Project: HBase Issue Type: Bug Affects Versions: 0.96.2, 0.98.1, 0.99.0 Reporter: Jean-Marc Spaggiari Assignee: Jean-Marc Spaggiari Attachments: HBASE-10728-v0-0.96.patch, HBASE-10728-v0-0.98.patch, HBASE-10728-v0-trunk.patch, HBASE-10728-v1-0.96.patch, HBASE-10728-v1-0.98.patch, HBASE-10728-v1-trunk.patch, HBASE-10728-v2-trunk.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11613) get_counter shell command is not displaying the result for counter columns.
[ https://issues.apache.org/jira/browse/HBASE-11613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14102052#comment-14102052 ] Y. SREENIVASULU REDDY commented on HBASE-11613: --- I verified the patch for issue HBASE-10728 in 0.98.x version it is working fine Resolve issue setting to duplicate. get_counter shell command is not displaying the result for counter columns. - Key: HBASE-11613 URL: https://issues.apache.org/jira/browse/HBASE-11613 Project: HBase Issue Type: Bug Components: shell Affects Versions: 0.98.3 Reporter: Y. SREENIVASULU REDDY Priority: Minor perform the following opertions in HBase shell prompt. 1. create a table with one column family. 2. insert some amount of data into the table. 3. then perform increment operation on any column qualifier. eg: incr 't', 'r1', 'f:c1' 4. then queried the get counter query, it is throwing nocounter found message to the user. {code} eg: hbase(main):010:0 get_counter 't', 'r1', 'f', 'c1' No counter found at specified coordinates {code} = and wrong message is throwing to user, while executing the get_counter query. {code} hbase(main):009:0 get_counter 't', 'r1', 'f' ERROR: wrong number of arguments (3 for 4) Here is some help for this command: Return a counter cell value at specified table/row/column coordinates. A cell cell should be managed with atomic increment function oh HBase and the data should be binary encoded. Example: hbase get_counter 'ns1:t1', 'r1', 'c1' hbase get_counter 't1', 'r1', 'c1' The same commands also can be run on a table reference. Suppose you had a reference t to table 't1', the corresponding command would be: hbase t.get_counter 'r1', 'c1' {code} {code} problem: In example they given 3 arguments but asking 4 arguments If run with 3 arguments it will throw error. if run with 4 arguments No counter found at specified coordinates message is throwing even though counter is specified. {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Assigned] (HBASE-11613) get_counter shell command is not displaying the result for counter columns.
[ https://issues.apache.org/jira/browse/HBASE-11613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Y. SREENIVASULU REDDY reassigned HBASE-11613: - Assignee: Y. SREENIVASULU REDDY get_counter shell command is not displaying the result for counter columns. - Key: HBASE-11613 URL: https://issues.apache.org/jira/browse/HBASE-11613 Project: HBase Issue Type: Bug Components: shell Affects Versions: 0.98.3 Reporter: Y. SREENIVASULU REDDY Assignee: Y. SREENIVASULU REDDY Priority: Minor perform the following opertions in HBase shell prompt. 1. create a table with one column family. 2. insert some amount of data into the table. 3. then perform increment operation on any column qualifier. eg: incr 't', 'r1', 'f:c1' 4. then queried the get counter query, it is throwing nocounter found message to the user. {code} eg: hbase(main):010:0 get_counter 't', 'r1', 'f', 'c1' No counter found at specified coordinates {code} = and wrong message is throwing to user, while executing the get_counter query. {code} hbase(main):009:0 get_counter 't', 'r1', 'f' ERROR: wrong number of arguments (3 for 4) Here is some help for this command: Return a counter cell value at specified table/row/column coordinates. A cell cell should be managed with atomic increment function oh HBase and the data should be binary encoded. Example: hbase get_counter 'ns1:t1', 'r1', 'c1' hbase get_counter 't1', 'r1', 'c1' The same commands also can be run on a table reference. Suppose you had a reference t to table 't1', the corresponding command would be: hbase t.get_counter 'r1', 'c1' {code} {code} problem: In example they given 3 arguments but asking 4 arguments If run with 3 arguments it will throw error. if run with 4 arguments No counter found at specified coordinates message is throwing even though counter is specified. {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HBASE-11613) get_counter shell command is not displaying the result for counter columns.
[ https://issues.apache.org/jira/browse/HBASE-11613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Y. SREENIVASULU REDDY resolved HBASE-11613. --- Resolution: Duplicate get_counter shell command is not displaying the result for counter columns. - Key: HBASE-11613 URL: https://issues.apache.org/jira/browse/HBASE-11613 Project: HBase Issue Type: Bug Components: shell Affects Versions: 0.98.3 Reporter: Y. SREENIVASULU REDDY Assignee: Y. SREENIVASULU REDDY Priority: Minor perform the following opertions in HBase shell prompt. 1. create a table with one column family. 2. insert some amount of data into the table. 3. then perform increment operation on any column qualifier. eg: incr 't', 'r1', 'f:c1' 4. then queried the get counter query, it is throwing nocounter found message to the user. {code} eg: hbase(main):010:0 get_counter 't', 'r1', 'f', 'c1' No counter found at specified coordinates {code} = and wrong message is throwing to user, while executing the get_counter query. {code} hbase(main):009:0 get_counter 't', 'r1', 'f' ERROR: wrong number of arguments (3 for 4) Here is some help for this command: Return a counter cell value at specified table/row/column coordinates. A cell cell should be managed with atomic increment function oh HBase and the data should be binary encoded. Example: hbase get_counter 'ns1:t1', 'r1', 'c1' hbase get_counter 't1', 'r1', 'c1' The same commands also can be run on a table reference. Suppose you had a reference t to table 't1', the corresponding command would be: hbase t.get_counter 'r1', 'c1' {code} {code} problem: In example they given 3 arguments but asking 4 arguments If run with 3 arguments it will throw error. if run with 4 arguments No counter found at specified coordinates message is throwing even though counter is specified. {code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11728) Data loss while scanning using PREFIX_TREE DATA-BLOCK-ENCODING
[ https://issues.apache.org/jira/browse/HBASE-11728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14102063#comment-14102063 ] wuchengzhi commented on HBASE-11728: [~ram_krish] ok, I got it, we will try to upgrade the version, thanks for your reminder. If I just replace the latest prefix-tree-xxx.jar, can it work? Data loss while scanning using PREFIX_TREE DATA-BLOCK-ENCODING -- Key: HBASE-11728 URL: https://issues.apache.org/jira/browse/HBASE-11728 Project: HBase Issue Type: Bug Components: Scanners Affects Versions: 0.96.1.1, 0.98.4 Environment: ubuntu12 hadoop-2.2.0 Hbase-0.96.1.1 SUN-JDK(1.7.0_06-b24) Reporter: wuchengzhi Assignee: ramkrishna.s.vasudevan Priority: Critical Fix For: 0.99.0, 2.0.0, 0.98.6 Attachments: 29cb562fad564b468ea9d61a2d60e8b0, HBASE-11728.patch, HBASE-11728_1.patch, HBASE-11728_2.patch, HBASE-11728_3.patch, HBASE-11728_4.patch, HFileAnalys.java, TestPrefixTree.java Original Estimate: 72h Remaining Estimate: 72h In the Scan case, I prepare some data as below: Table Desc (using the prefix-tree encoding): 'prefix_tree_test', {NAME => 'cf_1', DATA_BLOCK_ENCODING => 'PREFIX_TREE', TTL => '15552000'} and I put 5 rows as: (RowKey, Qualifier, Value) 'a-b-0-0', 'qf_1', 'c1-value' 'a-b-A-1', 'qf_1', 'c1-value' 'a-b-A-1-1402329600-1402396277', 'qf_2', 'c2-value' 'a-b-A-1-1402397227-1402415999', 'qf_2', 'c2-value-2' 'a-b-B-2-1402397300-1402416535', 'qf_2', 'c2-value-3' So I try to scan the rowKeys between 'a-b-A-1' and 'a-b-A-1:', and I got the correct result: Test 1: Scan scan = new Scan(); scan.setStartRow("a-b-A-1".getBytes()); scan.setStopRow("a-b-A-1:".getBytes()); -- 'a-b-A-1', 'qf_1', 'c1-value' 'a-b-A-1-1402329600-1402396277', 'qf_2', 'c2-value' 'a-b-A-1-1402397227-1402415999', 'qf_2', 'c2-value-2' And then I try the next scan, with addColumn Test 2: Scan scan = new Scan(); scan.addColumn(Bytes.toBytes("cf_1"), Bytes.toBytes("qf_2")); scan.setStartRow("a-b-A-1".getBytes()); scan.setStopRow("a-b-A-1:".getBytes()); -- expect: 
'a-b-A-1-1402329600-1402396277', 'qf_2', 'c2-value' 'a-b-A-1-1402397227-1402415999', 'qf_2', 'c2-value-2' but actually I got nothing. Then I update the addColumn to scan.addColumn(Bytes.toBytes("cf_1"), Bytes.toBytes("qf_1")); and I got the expected result 'a-b-A-1', 'qf_1', 'c1-value' as well. Then I do more testing... I update the case to make the startRow greater than 'a-b-A-1' Test 3: Scan scan = new Scan(); scan.setStartRow("a-b-A-1-".getBytes()); scan.setStopRow("a-b-A-1:".getBytes()); -- expect: 'a-b-A-1-1402329600-1402396277', 'qf_2', 'c2-value' 'a-b-A-1-1402397227-1402415999', 'qf_2', 'c2-value-2' but actually I got nothing again. I modify the start row to be greater than 'a-b-A-1-1402329600-1402396277': Scan scan = new Scan(); scan.setStartRow("a-b-A-1-140239".getBytes()); scan.setStopRow("a-b-A-1:".getBytes()); and I got the expected row as well: 'a-b-A-1-1402397227-1402415999', 'qf_2', 'c2-value-2' So, I think it may be a bug in the prefix-tree encoding. It happens after the data is flushed to the storefile, and it's ok while the data is in the memstore. -- This message was sent by Atlassian JIRA (v6.2#6252)
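The expected semantics of Test 2 in the report can be reproduced with a plain sorted map, independent of any block encoding: rows in [startRow, stopRow) that carry the selected qualifier. A correct PREFIX_TREE reader must return the same rows; the bug above returned none. This is a self-contained model, not HBase code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Model the table as row -> (qualifier -> value); a scan is a subMap over
// [startRow, stopRow) filtered by qualifier.
class ScanDemo {
    static TreeMap<String, Map<String, String>> sampleTable() {
        TreeMap<String, Map<String, String>> t = new TreeMap<>();
        t.put("a-b-0-0", Map.of("qf_1", "c1-value"));
        t.put("a-b-A-1", Map.of("qf_1", "c1-value"));
        t.put("a-b-A-1-1402329600-1402396277", Map.of("qf_2", "c2-value"));
        t.put("a-b-A-1-1402397227-1402415999", Map.of("qf_2", "c2-value-2"));
        t.put("a-b-B-2-1402397300-1402416535", Map.of("qf_2", "c2-value-3"));
        return t;
    }

    static List<String> scan(NavigableMap<String, Map<String, String>> table,
                             String start, String stop, String qualifier) {
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> e
                : table.subMap(start, stop).entrySet()) {
            if (e.getValue().containsKey(qualifier)) {
                hits.add(e.getKey());
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Test 2 from the report: start "a-b-A-1", stop "a-b-A-1:", column qf_2.
        for (String row : scan(sampleTable(), "a-b-A-1", "a-b-A-1:", "qf_2")) {
            System.out.println(row);
        }
    }
}
```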
[jira] [Commented] (HBASE-11762) Record the class name of Codec in WAL header
[ https://issues.apache.org/jira/browse/HBASE-11762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14102087#comment-14102087 ] Hudson commented on HBASE-11762: SUCCESS: Integrated in HBase-0.98 #457 (See [https://builds.apache.org/job/HBase-0.98/457/]) HBASE-11762 Record the class name of Codec in WAL header (tedyu: rev 3f38af605f3bcd3f61babc5e75b05ac7490e839e) * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestCustomWALCellCodec.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/ProtobufLogReader.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/SecureProtobufLogWriter.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/SequenceFileLogReader.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/ProtobufLogWriter.java * hbase-protocol/src/main/java/org/apache/hadoop/hbase/protobuf/generated/WALProtos.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/SecureProtobufLogReader.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/ReaderBase.java * hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestHLogReaderOnSecureHLog.java * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/WALCellCodec.java * hbase-protocol/src/main/protobuf/WAL.proto Record the class name of Codec in WAL header Key: HBASE-11762 URL: https://issues.apache.org/jira/browse/HBASE-11762 Project: HBase Issue Type: Task Components: wal Reporter: Ted Yu Assignee: Ted Yu Fix For: 1.0.0, 2.0.0, 0.98.6 Attachments: 11762-0.98.txt, 11762-v1.txt, 11762-v2.txt, 11762-v4.txt, 11762-v5.txt, 11762-v6.txt In follow-up discussion to HBASE-11620, Enis brought up this point: Related to this, should not we also write the CellCodec that we use in the WAL header. 
Right now, the codec comes from the configuration which means that you cannot read back the WAL files if you change the codec. This JIRA is to implement the above suggestion. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-11696) Make CombinedBlockCache resizable.
[ https://issues.apache.org/jira/browse/HBASE-11696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anoop Sam John updated HBASE-11696: --- Resolution: Fixed Fix Version/s: 0.99.0 Release Note: CombinedBlockCache is made resizable. See HBASE-5349 for auto resizing feature. On resize of this block cache, the L1 cache (ie. LRU cache) will get resized Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Make CombinedBlockCache resizable. -- Key: HBASE-11696 URL: https://issues.apache.org/jira/browse/HBASE-11696 Project: HBase Issue Type: Improvement Reporter: Anoop Sam John Assignee: Anoop Sam John Fix For: 0.99.0, 2.0.0 Attachments: HBASE-11696.patch HBASE-5349 adds auto tuning of memstore heap and block cache heap. Block cache needs to be resizable in order for this. CombinedBlockCache is not marked resizable now. We can make this. On resize the L1 cache (ie. LRU cache) can get resized. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11339) HBase MOB
[ https://issues.apache.org/jira/browse/HBASE-11339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14102218#comment-14102218 ] Jonathan Hsieh commented on HBASE-11339: [~jiajia], thanks for the update to the user guide. I think it has the key detail points (the whats) needed for a user who already understands what a MOB is and is for. We should add some context for users (the whys and the bigger picture) who aren't familiar with it, though, by adding some background to this user doc. We'll eventually fold it into the ref guide here [1]. Let me provide a quick draft that we could build off of. Before the bullets we should have some intro (this is a paraphrased version of the design doc's intro). {quote} Data comes in many sizes, and it is convenient to save binary data like images and documents into HBase. While HBase can handle binary objects with cells that are 1 byte to 10MB long, HBase's normal read and write paths are optimized for values smaller than 100KB in size. When HBase deals with large numbers of values from 100KB up to ~10MB of data, it encounters performance degradations due to write amplification caused by splits and compactions. HBase 2.0+ has added support for better managing large numbers of *Medium Objects* (MOBs) that maintains the same high-performance, strongly consistent characteristics with low operational overhead. To enable the feature, one must enable and configure the mob components in each region server and enable the mob feature on particular column families during table creation or table alter. Also, in the preview version of this feature, the admin must set up periodic processes that re-optimize the layout of mob data. Section: Enabling and configuring the mob feature on region servers. Need to enable the feature in flushes and compactions. Tuning settings on caches. user doc bullet 1. edit hbase-site... user doc bullet 7. 
mob cache Would be nice to have examples of doing this from the shell -- an example of creating a table with mob on a cf, and an example of a table alter that changes a cf to use the mob path. Section: Mob management The mob feature introduces a new read and write path to hbase and in its current incarnation requires external tools for housekeeping and reoptimization. There are two tools introduced -- the expiredMobFileCleaner for handling ttls and time-based expiry of data, and the sweep tool for coalescing small mob files or mob files with many deletions or updates. user doc bullet 8. Section: Enabling the mob feature on user tables This can be done when creating a table or when altering a table user doc bullet 2 (set cf with mob) user doc bullet 6 (threshold size) To a client, mob cells act just like normal cells. user doc bullet 3 put user doc bullet 4 scan There is a special scanner mode users can use to read the raw values user doc bullet 5. {quote} [1] http://hbase.apache.org/book.html HBase MOB - Key: HBASE-11339 URL: https://issues.apache.org/jira/browse/HBASE-11339 Project: HBase Issue Type: Umbrella Components: regionserver, Scanners Reporter: Jingcheng Du Assignee: Jingcheng Du Attachments: HBase MOB Design-v2.pdf, HBase MOB Design-v3.pdf, HBase MOB Design-v4.pdf, HBase MOB Design.pdf, MOB user guide .docx, hbase-11339-in-dev.patch It's quite useful to save medium binary data like images and documents into Apache HBase. Unfortunately, directly saving the binary MOB (medium object) to HBase leads to worse performance due to the frequent splits and compactions. In this design, the MOB data are stored in a more efficient way, which keeps high write/read performance and guarantees data consistency in Apache HBase. -- This message was sent by Atlassian JIRA (v6.2#6252)
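The core routing idea behind the draft above -- values over a per-family threshold bypass the normal compaction-heavy path -- can be sketched in a few lines. The 100KB threshold comes from the draft text; the method and path names are illustrative stand-ins, not the feature's real classes.

```java
// Toy model of the MOB write-path split: large values are written to
// separate mob files (rewritten rarely), small values follow the normal
// memstore/hfile path. Names and threshold handling are hypothetical.
class MobRoutingDemo {
    static final int MOB_THRESHOLD = 100 * 1024; // 100KB, per the draft

    static String route(byte[] value) {
        return value.length > MOB_THRESHOLD ? "mob-file" : "memstore/hfile";
    }

    public static void main(String[] args) {
        System.out.println(route(new byte[1024]));            // small cell
        System.out.println(route(new byte[5 * 1024 * 1024])); // 5MB document
    }
}
```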
[jira] [Comment Edited] (HBASE-11772) Bulk load mvcc and seqId issues with native hfiles
[ https://issues.apache.org/jira/browse/HBASE-11772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14102226#comment-14102226 ] Jean-Marc Spaggiari edited comment on HBASE-11772 at 8/19/14 2:09 PM: -- {code} /** - * @return true if this storefile was created by HFileOutputFormat + * @return true if this storefile was created by bulk load. * for a bulk load. */ {code} You might want to remove the for a bulk load line too. was (Author: jmspaggi): {quote} /** - * @return true if this storefile was created by HFileOutputFormat + * @return true if this storefile was created by bulk load. * for a bulk load. */ {quote} You might want to remove the for a bulk load line too. Bulk load mvcc and seqId issues with native hfiles -- Key: HBASE-11772 URL: https://issues.apache.org/jira/browse/HBASE-11772 Project: HBase Issue Type: Bug Affects Versions: 0.98.5 Reporter: Jerry He Assignee: Jerry He Priority: Critical Fix For: 0.98.6 Attachments: HBASE-11772-0.98.patch There are mvcc and seqId issues when bulk load native hfiles -- meaning hfiles that are direct file copy-out from hbase, not from HFileOutputFormat job. There are differences between these two types of hfiles. Native hfiles have possible non-zero MAX_MEMSTORE_TS_KEY value and non-zero mvcc values in cells. Native hfiles also have MAX_SEQ_ID_KEY. Native hfiles do not have BULKLOAD_TIME_KEY. Here are a couple of problems I observed when bulk load native hfiles. 1. Cells in newly bulk loaded hfiles can be invisible to scan. It is easy to re-create. Bulk load a native hfile that has a larger mvcc value in cells, e.g 10 If the current readpoint when initiating a scan is less than 10, the cells in the new hfile are skipped, thus become invisible. We don't reset the readpoint of a region after bulk load. 2. The current StoreFile.isBulkLoadResult() is implemented as: {code} return metadataMap.containsKey(BULKLOAD_TIME_KEY) {code} which does not detect bulkloaded native hfiles. 3. 
Another observed problem is possible data loss during log recovery. It is similar to HBASE-10958 reported by [~jdcryans]. Borrow the re-create steps from HBASE-10958. 1) Create an empty table 2) Put one row in it (let's say it gets seqid 1) 3) Bulk load one native hfile with large seqId ( e.g. 100). The native hfile can be obtained by copying out from existing table. 4) Kill the region server that holds the table's region. Scan the table once the region is made available again. The first row, at seqid 1, will be missing since the HFile with seqid 100 makes us believe that everything that came before it was flushed. The problem 3 is probably related to 2. We will be ok if we get the appended seqId during bulk load instead of 100 from inside the file. -- This message was sent by Atlassian JIRA (v6.2#6252)
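Problem 1 above reduces to a readpoint comparison: a scanner only returns cells whose mvcc version is at or below its readpoint, and bulk load does not advance the region's readpoint. A toy model makes the invisibility concrete (names are illustrative, not HBase internals):

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of scan-time mvcc filtering: a cell is skipped when its mvcc
// version is above the scanner's readpoint.
class ReadpointDemo {
    static List<Long> visible(List<Long> cellMvccs, long readPoint) {
        List<Long> out = new ArrayList<>();
        for (long mvcc : cellMvccs) {
            if (mvcc <= readPoint) { // cells "from the future" are skipped
                out.add(mvcc);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // A bulk-loaded native hfile carries a cell with mvcc 10, but the
        // region's readpoint is still 5 because it was not reset after load.
        List<Long> cells = List.of(10L);
        System.out.println(visible(cells, 5));  // cell is invisible
        System.out.println(visible(cells, 10)); // visible once readpoint catches up
    }
}
```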
[jira] [Commented] (HBASE-11772) Bulk load mvcc and seqId issues with native hfiles
[ https://issues.apache.org/jira/browse/HBASE-11772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14102226#comment-14102226 ] Jean-Marc Spaggiari commented on HBASE-11772: - {quote} /** - * @return true if this storefile was created by HFileOutputFormat + * @return true if this storefile was created by bulk load. * for a bulk load. */ {quote} You might want to remove the for a bulk load line too. Bulk load mvcc and seqId issues with native hfiles -- Key: HBASE-11772 URL: https://issues.apache.org/jira/browse/HBASE-11772 Project: HBase Issue Type: Bug Affects Versions: 0.98.5 Reporter: Jerry He Assignee: Jerry He Priority: Critical Fix For: 0.98.6 Attachments: HBASE-11772-0.98.patch There are mvcc and seqId issues when bulk load native hfiles -- meaning hfiles that are direct file copy-out from hbase, not from HFileOutputFormat job. There are differences between these two types of hfiles. Native hfiles have possible non-zero MAX_MEMSTORE_TS_KEY value and non-zero mvcc values in cells. Native hfiles also have MAX_SEQ_ID_KEY. Native hfiles do not have BULKLOAD_TIME_KEY. Here are a couple of problems I observed when bulk load native hfiles. 1. Cells in newly bulk loaded hfiles can be invisible to scan. It is easy to re-create. Bulk load a native hfile that has a larger mvcc value in cells, e.g 10 If the current readpoint when initiating a scan is less than 10, the cells in the new hfile are skipped, thus become invisible. We don't reset the readpoint of a region after bulk load. 2. The current StoreFile.isBulkLoadResult() is implemented as: {code} return metadataMap.containsKey(BULKLOAD_TIME_KEY) {code} which does not detect bulkloaded native hfiles. 3. Another observed problem is possible data loss during log recovery. It is similar to HBASE-10958 reported by [~jdcryans]. Borrow the re-create steps from HBASE-10958. 
1) Create an empty table
2) Put one row in it (let's say it gets seqid 1)
3) Bulk load one native hfile with a large seqId (e.g. 100). The native hfile can be obtained by copying out from an existing table.
4) Kill the region server that holds the table's region.
Scan the table once the region is made available again. The first row, at seqid 1, will be missing, since the HFile with seqid 100 makes us believe that everything that came before it was flushed. Problem 3 is probably related to problem 2. We will be ok if we use the seqId appended during bulk load instead of the 100 from inside the file. -- This message was sent by Atlassian JIRA (v6.2#6252)
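A sketch of how the incomplete isBulkLoadResult() check from problem 2 could be broadened. This is illustrative, not the committed fix: it assumes (as the seqId discussion suggests) that a file bulk loaded into a region gets a `_SeqId_<n>_` marker appended to its name, while HFileOutputFormat output carries the BULKLOAD_TIME_KEY metadata; the class name and String-keyed map are stand-ins for the real HBase types.

```java
import java.util.HashMap;
import java.util.Map;

class BulkLoadCheck {

    // Hypothetical sketch: a store file counts as a bulk load result if it
    // either carries the BULKLOAD_TIME_KEY metadata written by
    // HFileOutputFormat, or its file name carries the _SeqId_ marker appended
    // when a file is bulk loaded into a region. The old check looked only at
    // the metadata key, so native hfiles (plain copies out of hbase) were not
    // detected.
    static boolean isBulkLoadResult(Map<String, byte[]> metadataMap, String path) {
        if (metadataMap.containsKey("BULKLOAD_TIMESTAMP")) {
            return true;
        }
        String fileName = path.substring(path.lastIndexOf('/') + 1);
        return fileName.contains("_SeqId_");
    }
}
```

With a check like this, problem 3 can also be mitigated by using the seqId appended at load time rather than the MAX_SEQ_ID_KEY copied in from the source file.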
[jira] [Commented] (HBASE-11774) Avoid allocating unnecessary tag iterators
[ https://issues.apache.org/jira/browse/HBASE-11774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14102282#comment-14102282 ] Anoop Sam John commented on HBASE-11774: FYI.. I will commit HBASE-11553 in some time. I am incorporating the changes from this patch to visibility classes. bq.CellUtil.tagsIterator() used without the tags length check in VisibilityUtils also. Might be good to fix there also? This part also I am fixing. You can just avoid the visibility classes from the patch. Avoid allocating unnecessary tag iterators -- Key: HBASE-11774 URL: https://issues.apache.org/jira/browse/HBASE-11774 Project: HBase Issue Type: Improvement Reporter: Andrew Purtell Assignee: Andrew Purtell Priority: Minor Fix For: 0.99.0, 2.0.0, 0.98.6 Attachments: HBASE-11774.patch We can avoid an unnecessary object allocation, sometimes in hot code paths, by not creating a tag iterator if the cell's tag area is of length zero, signifying no tags present. -- This message was sent by Atlassian JIRA (v6.2#6252)
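The zero-length guard the patch adds can be sketched like this; the Cell interface and tag decoding here are minimal stand-ins for the HBase `Cell` and `CellUtil.tagsIterator()` types, kept small so the shape of the check is clear.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

class TagGuardSketch {
    // Stand-in for org.apache.hadoop.hbase.Cell; only what the guard needs.
    interface Cell {
        int getTagsLength();
        List<byte[]> decodeTags();  // stands in for CellUtil.tagsIterator(...)
    }

    // Only allocate an iterator when the cell actually carries tags; a tags
    // length of zero means no tags are present, so hot read paths skip the
    // object allocation entirely.
    static Iterator<byte[]> tags(Cell cell) {
        if (cell.getTagsLength() == 0) {
            return Collections.emptyIterator();  // shared instance, no allocation
        }
        return cell.decodeTags().iterator();
    }
}
```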
[jira] [Commented] (HBASE-11683) Metrics for MOB
[ https://issues.apache.org/jira/browse/HBASE-11683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14102310#comment-14102310 ] Jonathan Hsieh commented on HBASE-11683: {quote} I'm thinking how to implement the #2 mob reads, is it okay to record how many times the scanner read from the mob files? I don't see HBase has metrics in the normal scanner, is it necessary for the mob read? Please advise. Thanks. {quote} I'm thinking about this from the point of view of someone trying to decide if they should use the mob, or an operator verifying that the mobs are working. Flushes should cover the write side metrics. Ideally I'd want to know how much IO I'm saving or would save by using the mob feature, and this helps me understand that. We'd probably want some compaction related mob counts as well (# cells converted to mob, # converted from mob). However, I really do care about the read side as well. It would be great actually if we got general size statistics for the cells when reading, and stats on the mob caches as well. There are two places I'm thinking the data could be collected:
* Adding a counter every time the mob dereferences a cell (specific to mob)
* Adding cell size count buckets that the server tracks when a Result is sent from a get/scan.
Metrics for MOB --- Key: HBASE-11683 URL: https://issues.apache.org/jira/browse/HBASE-11683 Project: HBase Issue Type: Sub-task Components: regionserver, Scanners Affects Versions: 2.0.0 Reporter: Jonathan Hsieh Assignee: Jingcheng Du Attachments: HBASE-11683.diff We need to make sure to capture metrics about mobs. Some basic ones include:
# of mob writes
# of mob reads
# avg size of mob (?)
# mob files
# of mob compactions / sweeps
-- This message was sent by Atlassian JIRA (v6.2#6252)
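The second collection point above (cell size count buckets) could look roughly like this; the class and the power-of-two bucket scheme are invented for illustration and are not part of any patch.

```java
class CellSizeBuckets {
    // counts[b] holds the number of cells whose size falls in [2^(b-1), 2^b).
    // An operator can read the histogram to judge whether values are large
    // enough to benefit from the MOB path.
    final long[] counts = new long[32];

    void record(int cellSize) {
        // Bucket index is the bit length of the size (sizes < 1 clamp to 1).
        int bucket = 32 - Integer.numberOfLeadingZeros(Math.max(cellSize, 1));
        counts[Math.min(bucket, counts.length - 1)]++;
    }
}
```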
[jira] [Commented] (HBASE-10092) Move up on to log4j2
[ https://issues.apache.org/jira/browse/HBASE-10092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14102324#comment-14102324 ] Nicolas Liochon commented on HBASE-10092: - Yeah, using it directly is more in the scope of HBASE-11334. Putting a comment there. Move up on to log4j2 Key: HBASE-10092 URL: https://issues.apache.org/jira/browse/HBASE-10092 Project: HBase Issue Type: Sub-task Reporter: stack Assignee: Alex Newman Fix For: 2.0.0 Attachments: 10092.txt, 10092v2.txt, HBASE-10092.patch Allows logging with less friction. See http://logging.apache.org/log4j/2.x/ This rather radical transition can be done w/ minor change given they have an adapter for apache's logging, the one we use. They also have an adapter for slf4j so we can likely remove at least some of the 4 versions of this module our dependencies make use of. I made a start in the attached patch but am currently stuck in maven dependency resolve hell courtesy of our slf4j. Fixing will take some concentration and a good net connection, an item I currently lack. Other TODOs are that we will need to fix our little log level setting jsp page -- will likely have to undo our use of hadoop's tool here -- and the config system changes a little. I will return to this project soon. Will bring numbers. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11334) Migrate to SLF4J as logging interface
[ https://issues.apache.org/jira/browse/HBASE-11334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14102325#comment-14102325 ] Nicolas Liochon commented on HBASE-11334: - Would it make sense to use directly log4j2? Migrate to SLF4J as logging interface - Key: HBASE-11334 URL: https://issues.apache.org/jira/browse/HBASE-11334 Project: HBase Issue Type: Improvement Reporter: jay vyas Migrating to new log implementations is underway as in HBASE-10092. Next step would be to abstract them so that the hadoop community can standardize on a logging layer that is easy for end users to tune. Simplest way to do this is use SLF4j APIs as the main interface and binding/ implementation details in the docs as necessary. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11696) Make CombinedBlockCache resizable.
[ https://issues.apache.org/jira/browse/HBASE-11696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14102379#comment-14102379 ] Hudson commented on HBASE-11696: SUCCESS: Integrated in HBase-TRUNK #5410 (See [https://builds.apache.org/job/HBase-TRUNK/5410/]) HBASE-11696 Make CombinedBlockCache resizable. (anoopsamjohn: rev 3c13e8f3ced049431cff1f9f2c0baa92a1ca5c24) * hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/CombinedBlockCache.java Make CombinedBlockCache resizable. -- Key: HBASE-11696 URL: https://issues.apache.org/jira/browse/HBASE-11696 Project: HBase Issue Type: Improvement Reporter: Anoop Sam John Assignee: Anoop Sam John Fix For: 0.99.0, 2.0.0 Attachments: HBASE-11696.patch HBASE-5349 adds auto tuning of memstore heap and block cache heap. Block cache needs to be resizable in order for this. CombinedBlockCache is not marked resizable now. We can make this. On resize the L1 cache (ie. LRU cache) can get resized. -- This message was sent by Atlassian JIRA (v6.2#6252)
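The change described amounts to delegating a resize request to the on-heap L1 only. A minimal sketch with stand-in classes (the real ones are the LRU cache and the off-heap cache inside CombinedBlockCache; names here are illustrative):

```java
class CombinedCacheSketch {
    static class L1Cache {                    // stands in for the on-heap LRU cache
        private long maxSize;
        L1Cache(long maxSize) { this.maxSize = maxSize; }
        void setMaxSize(long maxSize) { this.maxSize = maxSize; }
        long getMaxSize() { return maxSize; }
    }

    private final L1Cache l1;
    private final long l2Capacity;            // off-heap L2, fixed capacity

    CombinedCacheSketch(L1Cache l1, long l2Capacity) {
        this.l1 = l1;
        this.l2Capacity = l2Capacity;
    }

    // The heap auto-tuner (HBASE-5349) calls this; only the on-heap L1 share
    // is adjusted, the L2 capacity is left alone.
    void setMaxSize(long size) {
        l1.setMaxSize(size);
    }

    long getL1MaxSize() { return l1.getMaxSize(); }
}
```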
[jira] [Issue Comment Deleted] (HBASE-11774) Avoid allocating unnecessary tag iterators
[ https://issues.apache.org/jira/browse/HBASE-11774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anoop Sam John updated HBASE-11774: --- Comment: was deleted (was: FYI.. I will commit HBASE-11553 in some time. I am incorporating the changes from this patch to visibility classes. bq.CellUtil.tagsIterator() used without the tags length check in VisibilityUtils also. Might be good to fix there also? This part also I am fixing. You can just avoid the visibility classes from the patch.) Avoid allocating unnecessary tag iterators -- Key: HBASE-11774 URL: https://issues.apache.org/jira/browse/HBASE-11774 Project: HBase Issue Type: Improvement Reporter: Andrew Purtell Assignee: Andrew Purtell Priority: Minor Fix For: 0.99.0, 2.0.0, 0.98.6 Attachments: HBASE-11774.patch We can avoid an unnecessary object allocation, sometimes in hot code paths, by not creating a tag iterator if the cell's tag area is of length zero, signifying no tags present. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-11773) Wrong field used for protobuf construction in RegionStates.
[ https://issues.apache.org/jira/browse/HBASE-11773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-11773: --- Resolution: Fixed Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Pushed to 0.98+. Thanks for the patch [~octo47]! Wrong field used for protobuf construction in RegionStates. --- Key: HBASE-11773 URL: https://issues.apache.org/jira/browse/HBASE-11773 Project: HBase Issue Type: Bug Components: Region Assignment Reporter: Andrey Stepachev Assignee: Andrey Stepachev Fix For: 0.99.0, 2.0.0, 0.98.6 Attachments: HBASE-11773-0.98.patch, HBASE-11773.patch Protobuf Java Pojo converter uses wrong field for converted enum construction (actually default value of protobuf message used). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-11657) Put HTable region methods in an interface
[ https://issues.apache.org/jira/browse/HBASE-11657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carter updated HBASE-11657: --- Status: Open (was: Patch Available) Put HTable region methods in an interface - Key: HBASE-11657 URL: https://issues.apache.org/jira/browse/HBASE-11657 Project: HBase Issue Type: Improvement Affects Versions: 0.99.0 Reporter: Carter Assignee: Carter Fix For: 0.99.0 Attachments: HBASE_11657.patch, HBASE_11657_v2.patch, HBASE_11657_v3.patch, HBASE_11657_v3.patch, HBASE_11657_v4.patch Most of the HTable methods are now abstracted by HTableInterface, with the notable exception of the following methods that pertain to region metadata:
{code}
HRegionLocation getRegionLocation(final String row)
HRegionLocation getRegionLocation(final byte [] row)
HRegionLocation getRegionLocation(final byte [] row, boolean reload)
byte [][] getStartKeys()
byte[][] getEndKeys()
Pair<byte[][], byte[][]> getStartEndKeys()
void clearRegionCache()
{code}
and a default scope method which maybe should be bundled with the others:
{code}
List<RegionLocations> listRegionLocations()
{code}
Since the consensus seems to be that these would muddy HTableInterface with non-core functionality, where should it go? MapReduce looks up the region boundaries, so it needs to be exposed somewhere. Let me throw out a straw man to start the conversation. I propose:
{code}
org.apache.hadoop.hbase.client.HRegionInterface
{code}
Have HTable implement this interface. Also add these methods to HConnection:
{code}
HRegionInterface getTableRegion(TableName tableName)
HRegionInterface getTableRegion(TableName tableName, ExecutorService pool)
{code}
[~stack], [~ndimiduk], [~enis], thoughts? -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-11657) Put HTable region methods in an interface
[ https://issues.apache.org/jira/browse/HBASE-11657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carter updated HBASE-11657: --- Attachment: HBASE_11657_v5.patch Put HTable region methods in an interface - Key: HBASE-11657 URL: https://issues.apache.org/jira/browse/HBASE-11657 Project: HBase Issue Type: Improvement Affects Versions: 0.99.0 Reporter: Carter Assignee: Carter Fix For: 0.99.0 Attachments: HBASE_11657.patch, HBASE_11657_v2.patch, HBASE_11657_v3.patch, HBASE_11657_v3.patch, HBASE_11657_v4.patch, HBASE_11657_v5.patch Most of the HTable methods are now abstracted by HTableInterface, with the notable exception of the following methods that pertain to region metadata:
{code}
HRegionLocation getRegionLocation(final String row)
HRegionLocation getRegionLocation(final byte [] row)
HRegionLocation getRegionLocation(final byte [] row, boolean reload)
byte [][] getStartKeys()
byte[][] getEndKeys()
Pair<byte[][], byte[][]> getStartEndKeys()
void clearRegionCache()
{code}
and a default scope method which maybe should be bundled with the others:
{code}
List<RegionLocations> listRegionLocations()
{code}
Since the consensus seems to be that these would muddy HTableInterface with non-core functionality, where should it go? MapReduce looks up the region boundaries, so it needs to be exposed somewhere. Let me throw out a straw man to start the conversation. I propose:
{code}
org.apache.hadoop.hbase.client.HRegionInterface
{code}
Have HTable implement this interface. Also add these methods to HConnection:
{code}
HRegionInterface getTableRegion(TableName tableName)
HRegionInterface getTableRegion(TableName tableName, ExecutorService pool)
{code}
[~stack], [~ndimiduk], [~enis], thoughts? -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-11657) Put HTable region methods in an interface
[ https://issues.apache.org/jira/browse/HBASE-11657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carter updated HBASE-11657: --- Status: Patch Available (was: Open) Submitted new patch in v5. Made HRL interface public, removed {{clearRegionCache()}}, added {{TableName getName()}}. Put HTable region methods in an interface - Key: HBASE-11657 URL: https://issues.apache.org/jira/browse/HBASE-11657 Project: HBase Issue Type: Improvement Affects Versions: 0.99.0 Reporter: Carter Assignee: Carter Fix For: 0.99.0 Attachments: HBASE_11657.patch, HBASE_11657_v2.patch, HBASE_11657_v3.patch, HBASE_11657_v3.patch, HBASE_11657_v4.patch, HBASE_11657_v5.patch Most of the HTable methods are now abstracted by HTableInterface, with the notable exception of the following methods that pertain to region metadata:
{code}
HRegionLocation getRegionLocation(final String row)
HRegionLocation getRegionLocation(final byte [] row)
HRegionLocation getRegionLocation(final byte [] row, boolean reload)
byte [][] getStartKeys()
byte[][] getEndKeys()
Pair<byte[][], byte[][]> getStartEndKeys()
void clearRegionCache()
{code}
and a default scope method which maybe should be bundled with the others:
{code}
List<RegionLocations> listRegionLocations()
{code}
Since the consensus seems to be that these would muddy HTableInterface with non-core functionality, where should it go? MapReduce looks up the region boundaries, so it needs to be exposed somewhere. Let me throw out a straw man to start the conversation. I propose:
{code}
org.apache.hadoop.hbase.client.HRegionInterface
{code}
Have HTable implement this interface. Also add these methods to HConnection:
{code}
HRegionInterface getTableRegion(TableName tableName)
HRegionInterface getTableRegion(TableName tableName, ExecutorService pool)
{code}
[~stack], [~ndimiduk], [~enis], thoughts? -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11696) Make CombinedBlockCache resizable.
[ https://issues.apache.org/jira/browse/HBASE-11696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14102439#comment-14102439 ] Hudson commented on HBASE-11696: SUCCESS: Integrated in HBase-1.0 #111 (See [https://builds.apache.org/job/HBase-1.0/111/]) HBASE-11696 Make CombinedBlockCache resizable. (anoopsamjohn: rev d502bafad2592e83672f3bbe3bae2e2fb48a19cc) * hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/CombinedBlockCache.java Make CombinedBlockCache resizable. -- Key: HBASE-11696 URL: https://issues.apache.org/jira/browse/HBASE-11696 Project: HBase Issue Type: Improvement Reporter: Anoop Sam John Assignee: Anoop Sam John Fix For: 0.99.0, 2.0.0 Attachments: HBASE-11696.patch HBASE-5349 adds auto tuning of memstore heap and block cache heap. Block cache needs to be resizable in order for this. CombinedBlockCache is not marked resizable now. We can make this. On resize the L1 cache (ie. LRU cache) can get resized. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-11774) Avoid allocating unnecessary tag iterators
[ https://issues.apache.org/jira/browse/HBASE-11774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-11774: --- Attachment: HBASE-11774_v2.patch bq. You can just avoid the visibility classes from the patch. Would save a bit of work maybe but I think each change should stand on its own and be complete. But yeah we will need these changes in HBASE-11553 also or that patch would regress on this point. bq. CellUtil.tagsIterator() used without the tags length check in VisibilityUtils also Attached v2 patch that includes this. Will commit shortly unless objection. Thanks for the reviews! All o.a.h.h.security.*.* tests pass locally. Avoid allocating unnecessary tag iterators -- Key: HBASE-11774 URL: https://issues.apache.org/jira/browse/HBASE-11774 Project: HBase Issue Type: Improvement Reporter: Andrew Purtell Assignee: Andrew Purtell Priority: Minor Fix For: 0.99.0, 2.0.0, 0.98.6 Attachments: HBASE-11774.patch, HBASE-11774_v2.patch We can avoid an unnecessary object allocation, sometimes in hot code paths, by not creating a tag iterator if the cell's tag area is of length zero, signifying no tags present. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11334) Migrate to SLF4J as logging interface
[ https://issues.apache.org/jira/browse/HBASE-11334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14102457#comment-14102457 ] stack commented on HBASE-11334: --- bq. Next step would be to abstract them so that the hadoop community can standardize on a logging layer that is easy for end users to tune. That sounds grand given all the different logging machines afoot in hadoop. The aim is to have all use the same? What about the 3rd parties that are outside of the hadoop umbrella? e.g. jetty? HBase is up on an abstraction already, apache commons logging. Why do we need to move to another [~jayunit100]? Migrate to SLF4J as logging interface - Key: HBASE-11334 URL: https://issues.apache.org/jira/browse/HBASE-11334 Project: HBase Issue Type: Improvement Reporter: jay vyas Migrating to new log implementations is underway as in HBASE-10092. Next step would be to abstract them so that the hadoop community can standardize on a logging layer that is easy for end users to tune. Simplest way to do this is use SLF4j APIs as the main interface and binding/ implementation details in the docs as necessary. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11334) Migrate to SLF4J as logging interface
[ https://issues.apache.org/jira/browse/HBASE-11334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14102468#comment-14102468 ] stack commented on HBASE-11334: --- bq. Why do we need to move to another [~jayunit100]? Smile = http://jayunit100.blogspot.com/2013/10/simplifying-distinction-between-sl4j.html Migrate to SLF4J as logging interface - Key: HBASE-11334 URL: https://issues.apache.org/jira/browse/HBASE-11334 Project: HBase Issue Type: Improvement Reporter: jay vyas Migrating to new log implementations is underway as in HBASE-10092. Next step would be to abstract them so that the hadoop community can standardize on a logging layer that is easy for end users to tune. Simplest way to do this is use SLF4j APIs as the main interface and binding/ implementation details in the docs as necessary. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11334) Migrate to SLF4J as logging interface
[ https://issues.apache.org/jira/browse/HBASE-11334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14102473#comment-14102473 ] stack commented on HBASE-11334: --- [~jayunit100] What you suggest for reconciling logging engines in hbase? We bundle a bunch of third-parties -- hadoop and non-hadoop -- with conflicting loggings and then we ourselves are on the classpath of other apps/containers. Migrate to SLF4J as logging interface - Key: HBASE-11334 URL: https://issues.apache.org/jira/browse/HBASE-11334 Project: HBase Issue Type: Improvement Reporter: jay vyas Migrating to new log implementations is underway as in HBASE-10092. Next step would be to abstract them so that the hadoop community can standardize on a logging layer that is easy for end users to tune. Simplest way to do this is use SLF4j APIs as the main interface and binding/ implementation details in the docs as necessary. -- This message was sent by Atlassian JIRA (v6.2#6252)
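One practical difference the SLF4J interface brings over string-concatenating log calls is parameterized logging: the message is only assembled when the level is enabled. A minimal stand-in of that idea (this is not the real org.slf4j.Logger, whose debug(String, Object...) behaves this way):

```java
class MiniLogger {
    boolean debugEnabled = false;
    int messagesFormatted = 0;

    // SLF4J-style: the {} placeholders are substituted only when the level is
    // on, so disabled debug logging costs almost nothing.
    void debug(String format, Object... args) {
        if (!debugEnabled) {
            return;                       // no formatting, no allocation
        }
        messagesFormatted++;
        StringBuilder sb = new StringBuilder();
        int from = 0;
        for (Object arg : args) {
            int at = format.indexOf("{}", from);
            if (at < 0) break;
            sb.append(format, from, at).append(arg);
            from = at + 2;
        }
        sb.append(format.substring(from));
        System.out.println(sb);
    }
}
```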
[jira] [Updated] (HBASE-11774) Avoid allocating unnecessary tag iterators
[ https://issues.apache.org/jira/browse/HBASE-11774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-11774: --- Attachment: HBASE-11774_v2-0.98.patch Avoid allocating unnecessary tag iterators -- Key: HBASE-11774 URL: https://issues.apache.org/jira/browse/HBASE-11774 Project: HBase Issue Type: Improvement Reporter: Andrew Purtell Assignee: Andrew Purtell Priority: Minor Fix For: 0.99.0, 2.0.0, 0.98.6 Attachments: HBASE-11774.patch, HBASE-11774_v2-0.98.patch, HBASE-11774_v2.patch We can avoid an unnecessary object allocation, sometimes in hot code paths, by not creating a tag iterator if the cell's tag area is of length zero, signifying no tags present. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-11763) Move TTL handling into ScanQueryMatcher
[ https://issues.apache.org/jira/browse/HBASE-11763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-11763: --- Attachment: HBASE-11763.patch Updated patch drops a junk change in ScanDeleteTracker that would cause a javadoc warning. Move TTL handling into ScanQueryMatcher --- Key: HBASE-11763 URL: https://issues.apache.org/jira/browse/HBASE-11763 Project: HBase Issue Type: Sub-task Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 0.99.0, 2.0.0, 0.98.6 Attachments: HBASE-11763.patch, HBASE-11763.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11764) Support per cell TTLs
[ https://issues.apache.org/jira/browse/HBASE-11764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14102496#comment-14102496 ] Andrew Purtell commented on HBASE-11764: bq. So you will add setter in Mutation (Non Delete) to pass the per cell TTL right? Yes. Also, I realized this morning that this part of HBASE-11763 will need to be changed in the patch on this issue:
{code}
@@ -362,9 +360,16 @@ public class ScanQueryMatcher {
       }
       // note the following next else if...
       // delete marker are not subject to other delete markers
-    } else if (!this.deletes.isEmpty()) {
-      DeleteResult deleteResult = deletes.isDeleted(cell);
-      switch (deleteResult) {
+    } else {
+      // If the cell is expired and we have enough versions, skip
+      if (columns.hasMinVersions() && HStore.isExpired(cell, oldestUnexpiredTS)) {
+        return columns.getNextRowOrNextColumn(cell.getQualifierArray(), qualifierOffset,
+            qualifierLength);
+      }
+      // Check deletes
+      if (!this.deletes.isEmpty()) {
+        DeleteResult deleteResult = deletes.isDeleted(cell);
+        switch (deleteResult) {
         case FAMILY_DELETED:
         case COLUMN_DELETED:
           return columns.getNextRowOrNextColumn(cell.getQualifierArray(),
{code}
We can't assume based on a cell TTL that we can skip to the next column. We can only skip the current cell. This may affect scanning performance unconditionally. Up to now additional costs like the tag iterator would be avoided wherever cells do not have tags. Support per cell TTLs - Key: HBASE-11764 URL: https://issues.apache.org/jira/browse/HBASE-11764 Project: HBase Issue Type: Sub-task Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 0.99.0, 0.98.6 Attachments: HBASE-11764.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11646) Handle the MOB in compaction
[ https://issues.apache.org/jira/browse/HBASE-11646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14102497#comment-14102497 ] Jonathan Hsieh commented on HBASE-11646: note -- the patch is being reviewed here https://reviews.apache.org/r/24736/ Handle the MOB in compaction Key: HBASE-11646 URL: https://issues.apache.org/jira/browse/HBASE-11646 Project: HBase Issue Type: Sub-task Components: Compaction Reporter: Jingcheng Du Assignee: Jingcheng Du Attachments: HBASE-11646.diff In the updated MOB design however, admins can set CF level thresholds that would force cell values above the threshold to use the MOB write path instead of the traditional path. There are two cases where mobs need to interact with this threshold: 1) How do we handle the case when the threshold size is changed? 2) Today, you can bulkload hfiles that contain MOBs. These cells will work as normal inside hbase. Unfortunately the cells with MOBs in them will never benefit from the MOB write path. The proposal here is to modify compaction in mob enabled cf's such that the threshold value is honored by compactions. This handles case #1 -- elements that should be moved out of the normal hfiles get 'compacted' into refs and mob hfiles, and values that should be pulled into the cf get derefed and written out wholly in the compaction. For case #2, we can maintain the same behavior and compaction would move data into the mob writepath/lifecycle. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11742) Backport HBASE-7987 and HBASE-11185 to 0.98
[ https://issues.apache.org/jira/browse/HBASE-11742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14102502#comment-14102502 ] Esteban Gutierrez commented on HBASE-11742: --- Thanks [~mbertozzi]. [~apurtell] does it look good to you? Backport HBASE-7987 and HBASE-11185 to 0.98 --- Key: HBASE-11742 URL: https://issues.apache.org/jira/browse/HBASE-11742 Project: HBase Issue Type: Improvement Components: mapreduce, snapshots Affects Versions: 0.98.5 Reporter: Esteban Gutierrez Assignee: Esteban Gutierrez Fix For: 0.98.6 Attachments: HBASE-11742.v0.patch, HBASE-11742.v1.patch HBASE-7987 improves how snapshots are handled via a manifest file. This requires reverting HBASE-11360 since it introduces alternate functionality that is not compatible with HBASE-7987. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Comment Edited] (HBASE-11764) Support per cell TTLs
[ https://issues.apache.org/jira/browse/HBASE-11764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14102496#comment-14102496 ] Andrew Purtell edited comment on HBASE-11764 at 8/19/14 5:30 PM: - bq. So you will add setter in Mutation (Non Delete) to pass the per cell TTL right? Yes. Also, I realized this morning that this part of HBASE-11763 will need to be changed in the patch on this issue:
{code}
@@ -362,9 +360,16 @@ public class ScanQueryMatcher {
       }
       // note the following next else if...
       // delete marker are not subject to other delete markers
-    } else if (!this.deletes.isEmpty()) {
-      DeleteResult deleteResult = deletes.isDeleted(cell);
-      switch (deleteResult) {
+    } else {
+      // If the cell is expired and we have enough versions, skip
+      if (columns.hasMinVersions() && HStore.isExpired(cell, oldestUnexpiredTS)) {
+        return columns.getNextRowOrNextColumn(cell.getQualifierArray(), qualifierOffset,
+            qualifierLength);
+      }
+      // Check deletes
+      if (!this.deletes.isEmpty()) {
+        DeleteResult deleteResult = deletes.isDeleted(cell);
+        switch (deleteResult) {
         case FAMILY_DELETED:
         case COLUMN_DELETED:
           return columns.getNextRowOrNextColumn(cell.getQualifierArray(),
{code}
HStore#isExpired has been changed to do, if a cell TTL is available, a comparison of the cell's timestamp with its TTL tag, ignoring the family setting (oldestUnexpiredTS). We can't assume based on a cell TTL that we can skip to the next column. We can only skip the current cell, because a cell TTL overrides any family setting in the narrowest scope of a single cell. The earlier assumption that once we hit an expired cell no earlier cell is alive is no longer true. This may affect scanning performance unconditionally. Up to now additional costs like the tag iterator would be avoided wherever cells do not have tags. was (Author: apurtell): bq. So you will add setter in Mutation (Non Delete) to pass the per cell TTL right? Yes.
Also, I realized this morning that this part of HBASE-11763 will need to be changed in the patch on this issue:
{code}
@@ -362,9 +360,16 @@ public class ScanQueryMatcher {
       }
       // note the following next else if...
       // delete marker are not subject to other delete markers
-    } else if (!this.deletes.isEmpty()) {
-      DeleteResult deleteResult = deletes.isDeleted(cell);
-      switch (deleteResult) {
+    } else {
+      // If the cell is expired and we have enough versions, skip
+      if (columns.hasMinVersions() && HStore.isExpired(cell, oldestUnexpiredTS)) {
+        return columns.getNextRowOrNextColumn(cell.getQualifierArray(), qualifierOffset,
+            qualifierLength);
+      }
+      // Check deletes
+      if (!this.deletes.isEmpty()) {
+        DeleteResult deleteResult = deletes.isDeleted(cell);
+        switch (deleteResult) {
         case FAMILY_DELETED:
         case COLUMN_DELETED:
           return columns.getNextRowOrNextColumn(cell.getQualifierArray(),
{code}
We can't assume based on a cell TTL that we can skip to the next column. We can only skip the current cell. This may affect scanning performance unconditionally. Up to now additional costs like the tag iterator would be avoided wherever cells do not have tags. Support per cell TTLs - Key: HBASE-11764 URL: https://issues.apache.org/jira/browse/HBASE-11764 Project: HBase Issue Type: Sub-task Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 0.99.0, 0.98.6 Attachments: HBASE-11764.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11764) Support per cell TTLs
[ https://issues.apache.org/jira/browse/HBASE-11764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14102508#comment-14102508 ] Andrew Purtell commented on HBASE-11764: Anyway, the above is easy to deal with, I just have to break out the expiration tests into one check for cell TTL and another for family and skip or move to next column based on each test individually. Just calling your attention to the issue with the current patch Support per cell TTLs - Key: HBASE-11764 URL: https://issues.apache.org/jira/browse/HBASE-11764 Project: HBase Issue Type: Sub-task Reporter: Andrew Purtell Assignee: Andrew Purtell Fix For: 0.99.0, 0.98.6 Attachments: HBASE-11764.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
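The split described above — check the cell TTL and the family TTL separately, skipping only the one cell in the former case — can be sketched as follows. The method, parameters, and return codes are illustrative stand-ins for ScanQueryMatcher's MatchCode logic, not the actual patch.

```java
class TtlMatchSketch {
    enum MatchCode { SKIP, SEEK_NEXT_COL, INCLUDE }

    static MatchCode check(long cellTs, long cellTtl /* <0 means no cell TTL */,
                           long now, long oldestUnexpiredTs, boolean hasMinVersions) {
        if (cellTtl >= 0) {
            // A per-cell TTL overrides the family setting for this one cell.
            // If expired, skip only this cell: older cells in the same column
            // may carry different TTLs and still be alive.
            return (cellTs + cellTtl <= now) ? MatchCode.SKIP : MatchCode.INCLUDE;
        }
        if (hasMinVersions && cellTs < oldestUnexpiredTs) {
            // Family TTL: once one cell is expired, everything older in this
            // column is expired too, so seek straight to the next column.
            return MatchCode.SEEK_NEXT_COL;
        }
        return MatchCode.INCLUDE;
    }
}
```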
[jira] [Commented] (HBASE-4955) Use the official versions of surefire junit
[ https://issues.apache.org/jira/browse/HBASE-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14102516#comment-14102516 ] stack commented on HBASE-4955: -- Its a 'foriegn' test, one that came in from hadoop when we copy/pasted http. Its second class. Could comment it out if only failing test (as per @nkeywal -- sort of) Use the official versions of surefire junit - Key: HBASE-4955 URL: https://issues.apache.org/jira/browse/HBASE-4955 Project: HBase Issue Type: Improvement Components: test Affects Versions: 0.94.0, 0.98.0, 0.96.0, 0.99.0 Environment: all Reporter: Nicolas Liochon Assignee: Nicolas Liochon Priority: Critical Attachments: 4955.v1.patch, 4955.v2.patch, 4955.v2.patch, 4955.v2.patch, 4955.v2.patch, 4955.v3.patch, 4955.v3.patch, 4955.v3.patch, 4955.v4.patch, 4955.v4.patch, 4955.v4.patch, 4955.v4.patch, 4955.v4.patch, 4955.v4.patch, 4955.v5.patch, 4955.v6.patch, 4955.v7.patch, 4955.v7.patch, 4955.v8.patch, 4955.v9.patch, 8204.v4.patch We currently use private versions for Surefire JUnit since HBASE-4763. This JIRA traks what we need to move to official versions. Surefire 2.11 is just out, but, after some tests, it does not contain all what we need. JUnit. Could be for JUnit 4.11. Issue to monitor: https://github.com/KentBeck/junit/issues/359: fixed in our version, no feedback for an integration on trunk Surefire: Could be for Surefire 2.12. Issues to monitor are: 329 (category support): fixed, we use the official implementation from the trunk 786 (@Category with forkMode=always): fixed, we use the official implementation from the trunk 791 (incorrect elapsed time on test failure): fixed, we use the official implementation from the trunk 793 (incorrect time in the XML report): Not fixed (reopen) on trunk, fixed on our version. 
760 (does not take into account the test method): fixed in trunk, not fixed in our version 798 (print immediately the test class name): not fixed in trunk, not fixed in our version 799 (allow test parallelization when forkMode=always): not fixed in trunk, not fixed in our version 800 (redirectTestOutputToFile not taken into account): not yet fixed on trunk, fixed in our version 800 and 793 are the most important to monitor; they are the only ones that are fixed in our version but not on trunk. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11772) Bulk load mvcc and seqId issues with native hfiles
[ https://issues.apache.org/jira/browse/HBASE-11772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14102519#comment-14102519 ] Jerry He commented on HBASE-11772: -- Hi, [~jmspaggi] Will do. Bulk load mvcc and seqId issues with native hfiles -- Key: HBASE-11772 URL: https://issues.apache.org/jira/browse/HBASE-11772 Project: HBase Issue Type: Bug Affects Versions: 0.98.5 Reporter: Jerry He Assignee: Jerry He Priority: Critical Fix For: 0.98.6 Attachments: HBASE-11772-0.98.patch There are mvcc and seqId issues when bulk loading native hfiles -- meaning hfiles that are direct file copy-outs from hbase, not the output of an HFileOutputFormat job. There are differences between these two types of hfiles. Native hfiles can have a non-zero MAX_MEMSTORE_TS_KEY value and non-zero mvcc values in cells. Native hfiles also have MAX_SEQ_ID_KEY. Native hfiles do not have BULKLOAD_TIME_KEY. Here are a couple of problems I observed when bulk loading native hfiles. 1. Cells in newly bulk loaded hfiles can be invisible to scan. It is easy to re-create. Bulk load a native hfile that has a larger mvcc value in its cells, e.g. 10. If the current readpoint when initiating a scan is less than 10, the cells in the new hfile are skipped and thus become invisible. We don't reset the readpoint of a region after bulk load. 2. The current StoreFile.isBulkLoadResult() is implemented as: {code} return metadataMap.containsKey(BULKLOAD_TIME_KEY); {code} which does not detect bulkloaded native hfiles. 3. Another observed problem is possible data loss during log recovery. It is similar to HBASE-10958 reported by [~jdcryans]. Borrow the re-create steps from HBASE-10958. 1) Create an empty table 2) Put one row in it (let's say it gets seqid 1) 3) Bulk load one native hfile with a large seqId (e.g. 100). The native hfile can be obtained by copying out from an existing table. 4) Kill the region server that holds the table's region. Scan the table once the region is made available again.
The first row, at seqid 1, will be missing, since the HFile with seqid 100 makes us believe that everything that came before it was flushed. Problem 3 is probably related to problem 2. We will be OK if we use the seqId appended during bulk load instead of the 100 from inside the file. -- This message was sent by Atlassian JIRA (v6.2#6252)
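Problem 1 comes down to the scanner's mvcc visibility rule: a cell whose mvcc version is above the scanner's read point is skipped. A minimal sketch of that rule, with illustrative names rather than HBase's actual code:

```java
// Illustrative sketch of mvcc-based visibility during a scan. A cell is
// visible only if its mvcc version is at or below the scanner's read point.
// A bulk-loaded native hfile can carry cells with mvcc 10 while the region's
// read point is lower, so those cells are silently skipped, matching the
// "invisible cells" symptom described in the issue.
final class MvccVisibility {
    static boolean isVisible(long cellMvcc, long readPoint) {
        return cellMvcc <= readPoint;
    }
}
```

This also shows why zeroing the mvcc (as flush/compaction does for old-enough cells) makes a cell unconditionally visible.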
[jira] [Commented] (HBASE-11591) Scanner fails to retrieve KV from bulk loaded file with highest sequence id than the cell's mvcc in a non-bulk loaded file
[ https://issues.apache.org/jira/browse/HBASE-11591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14102535#comment-14102535 ] stack commented on HBASE-11591: --- There is the SequenceNumber Interface but that is only about getting a SequenceNumber. As per you fellows, don't think we need to add the method to Cell. There are no setters in Cell currently. Why start now. A marker Interface that allows you to set the sequence id on the hosting object seems fine. MutableCell is a little ugly since it tarnishes our nice 'Cell' notion. What about adding a setter on SequenceNumber? One of the implementors is HLogKey. It has a: void setLogSeqNum(final long sequence) { this.logSeqNum = sequence; this.seqNumAssignedLatch.countDown(); } Scanner fails to retrieve KV from bulk loaded file with highest sequence id than the cell's mvcc in a non-bulk loaded file --- Key: HBASE-11591 URL: https://issues.apache.org/jira/browse/HBASE-11591 Project: HBase Issue Type: Bug Affects Versions: 0.99.0 Reporter: ramkrishna.s.vasudevan Assignee: ramkrishna.s.vasudevan Priority: Critical Fix For: 0.99.0 Attachments: HBASE-11591.patch, HBASE-11591_1.patch, HBASE-11591_2.patch, HBASE-11591_3.patch, TestBulkload.java See discussion in HBASE-11339. We have a case where the same KVs are in two files, one produced by flush/compaction and the other through bulk load. Both files have some of the same KVs, which match even in timestamp. Steps: Add some rows with a specific timestamp and flush the same. Bulk load a file with the same data. Ensure that the assign-seqnum property is set. The bulk load should use HFileOutputFormat2 (or ensure that we write the bulk_time_output key). This would ensure that the bulk loaded file has the highest seq num. Assume the cell in the flushed/compacted store file is row1,cf,cq,ts1,value1 and the cell in the bulk loaded file is row1,cf,cq,ts1,value2 (there are no parallel scans). Issue a scan on the table in 0.96.
The retrieved value is row1,cf,cq,ts1,value2. But the same scan in 0.98 will retrieve row1,cf,cq,ts1,value1. This is a behaviour change. It is because of this code {code} public int compare(KeyValueScanner left, KeyValueScanner right) { int comparison = compare(left.peek(), right.peek()); if (comparison != 0) { return comparison; } else { // Since both the keys are exactly the same, we break the tie in favor // of the key which came latest. long leftSequenceID = left.getSequenceID(); long rightSequenceID = right.getSequenceID(); if (leftSequenceID > rightSequenceID) { return -1; } else if (leftSequenceID < rightSequenceID) { return 1; } else { return 0; } } } {code} In the 0.96 case the mvcc of the cell in both files is 0, so the comparison falls through to the else branch, where the seq id of the bulk loaded file is greater and sorts first, ensuring that the scan reads from the bulk loaded file. In 0.98+, since we retain the mvcc+seqid, the mvcc is not set to 0 (it remains a non-zero positive value). Hence compare() sorts the cell in the flushed/compacted file first, which means that although we know the latest file is the bulk loaded one, we don't scan its data. Seems to be a behaviour change. Will check other corner cases as well, but we are trying to pin down the behaviour of bulk load because we are evaluating whether it can be used for the MOB design. -- This message was sent by Atlassian JIRA (v6.2#6252)
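The tie-break in the quoted compare() can be exercised in isolation. The sketch below is simplified (the real comparator first compares the cells the two scanners peek at), but it shows that once the keys compare equal, the scanner backed by the higher-sequence-id file sorts first, which is why zeroed mvcc values let the bulk loaded file win in 0.96:

```java
// Simplified sketch of the KeyValueScanner tie-break: equal keys are broken
// in favor of the scanner with the higher sequence id, i.e. the newer file.
final class ScannerOrder {
    // Returns -1 if the left scanner should be read first, 1 if the right
    // scanner should, 0 if they are indistinguishable.
    static int tieBreak(long leftSequenceID, long rightSequenceID) {
        if (leftSequenceID > rightSequenceID) {
            return -1;
        } else if (leftSequenceID < rightSequenceID) {
            return 1;
        }
        return 0;
    }
}
```

With a bulk loaded file at seq id 100 and a flushed file at seq id 5, the tie-break picks the bulk loaded file; the 0.98 problem is that a retained non-zero mvcc decides the comparison before this tie-break is ever reached.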
[jira] [Assigned] (HBASE-11574) hbase:meta's regions can be replicated
[ https://issues.apache.org/jira/browse/HBASE-11574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj Das reassigned HBASE-11574: --- Assignee: Devaraj Das hbase:meta's regions can be replicated -- Key: HBASE-11574 URL: https://issues.apache.org/jira/browse/HBASE-11574 Project: HBase Issue Type: Sub-task Reporter: Enis Soztutar Assignee: Devaraj Das As mentioned elsewhere, we can leverage hbase-10070 features to create replicas for the meta tables regions so that: 1. meta hotspotting can be circumvented 2. meta becomes highly available for reading -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14102555#comment-14102555 ] Francis Liu commented on HBASE-11165: - {quote} Can I have some pointers on how to read the above. Zk-less AM is better because you scan a table – you don't have to ls znodes? What is the 1M znodes vs 1M rows about in above? {quote} Essentially the APIs are better. I.e. with 1M rows we can iterate over the rows instead of doing an ls and getting back one huge chunk of data; deleting 1M znodes takes too long, whereas this could be parallelized against an hbase table. For 2.a, the response is below. For 2.b, it's mainly a concern whether we'll hit other ZK issues when having that many child znodes (1M and beyond). The HDFS guys are already looking into scaling the number of child directories for the NN. Will update the doc. {quote} Francis Liu Is the above the basis for your ...As our experiments shows splitting is a must for scaling.? If split meta, then more read/write throughput? {quote} If we split meta, then: 1) Less write amplification (i.e. no large compactions), better W throughput. 2) More disks, more R/W throughput. 3) More heap to fit meta, better R throughput. {quote} Because the meta table could be served by many machines so field more reads/writes? The reads/writes are needed at starttime or during cluster lifetime in your judgement? Thanks. {quote} Yep, needed for startup. We need to do experiments for the 1-rack and 2-rack failure cases for the cluster lifetime question. Though large compactions would creep up on you, so splitting would still be motivating for cluster lifetime IMHO. Scaling so cluster can host 1M regions and beyond (50M regions?)
Key: HBASE-11165 URL: https://issues.apache.org/jira/browse/HBASE-11165 Project: HBase Issue Type: Brainstorming Reporter: stack Attachments: HBASE-11165.zip, Region Scalability test.pdf, zk_less_assignment_comparison_2.pdf This discussion issue comes out of Co-locate Meta And Master HBASE-10569 and comments on the doc posted there. A user -- our Francis Liu -- needs to be able to scale a cluster to do 1M regions maybe even 50M later. This issue is about discussing how we will do that (or if not 50M on a cluster, how otherwise we can attain same end). More detail to follow. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11657) Put HTable region methods in an interface
[ https://issues.apache.org/jira/browse/HBASE-11657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14102569#comment-14102569 ] stack commented on HBASE-11657: --- bq. For RegionLocations, we may need an immutable version. Sounds good. Looking at patch, would I ever have to clean up a RegionLocator... call close on it when done? Otherwise +1 Put HTable region methods in an interface - Key: HBASE-11657 URL: https://issues.apache.org/jira/browse/HBASE-11657 Project: HBase Issue Type: Improvement Affects Versions: 0.99.0 Reporter: Carter Assignee: Carter Fix For: 0.99.0 Attachments: HBASE_11657.patch, HBASE_11657_v2.patch, HBASE_11657_v3.patch, HBASE_11657_v3.patch, HBASE_11657_v4.patch, HBASE_11657_v5.patch Most of the HTable methods are now abstracted by HTableInterface, with the notable exception of the following methods that pertain to region metadata: {code} HRegionLocation getRegionLocation(final String row) HRegionLocation getRegionLocation(final byte [] row) HRegionLocation getRegionLocation(final byte [] row, boolean reload) byte [][] getStartKeys() byte[][] getEndKeys() Pair<byte[][], byte[][]> getStartEndKeys() void clearRegionCache() {code} and a default scope method which maybe should be bundled with the others: {code} List<RegionLocations> listRegionLocations() {code} Since the consensus seems to be that these would muddy HTableInterface with non-core functionality, where should it go? MapReduce looks up the region boundaries, so it needs to be exposed somewhere. Let me throw out a straw man to start the conversation. I propose: {code} org.apache.hadoop.hbase.client.HRegionInterface {code} Have HTable implement this interface. Also add these methods to HConnection: {code} HRegionInterface getTableRegion(TableName tableName) HRegionInterface getTableRegion(TableName tableName, ExecutorService pool) {code} [~stack], [~ndimiduk], [~enis], thoughts? -- This message was sent by Atlassian JIRA (v6.2#6252)
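As a rough illustration of the straw man, trimmed to the key-boundary methods so it compiles stand-alone (the real proposal also returns HRegionLocation and RegionLocations, which are HBase client types), the interface plus a trivial single-region fake might look like this; Closeable is included to reflect the review question about whether callers must clean the object up:

```java
import java.io.Closeable;

// Hypothetical sketch of the proposed region-metadata interface. The name and
// the Closeable bound are illustrative, not the committed API.
interface RegionBoundaries extends Closeable {
    byte[][] getStartKeys();
    byte[][] getEndKeys();
}

// Trivial fake: one region spanning the whole key space, marked by the empty
// byte array as both start and end key (HBase's convention).
final class SingleRegionBoundaries implements RegionBoundaries {
    @Override public byte[][] getStartKeys() { return new byte[][] { new byte[0] }; }
    @Override public byte[][] getEndKeys()   { return new byte[][] { new byte[0] }; }
    @Override public void close() { }
}
```

A MapReduce input format would call getStartKeys()/getEndKeys() to compute splits, then close() the locator when done, which is exactly the lifecycle question raised in the comment.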
[jira] [Commented] (HBASE-11773) Wrong field used for protobuf construction in RegionStates.
[ https://issues.apache.org/jira/browse/HBASE-11773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14102581#comment-14102581 ] Hudson commented on HBASE-11773: FAILURE: Integrated in HBase-1.0 #112 (See [https://builds.apache.org/job/HBase-1.0/112/]) HBASE-11773 Wrong field used for protobuf construction in RegionStates (Andrey Stepachev) (apurtell: rev 4901e649b64700e6796c2ba2da24ac2b906273ec) * hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestRegionState.java * hbase-client/src/main/java/org/apache/hadoop/hbase/master/RegionState.java Wrong field used for protobuf construction in RegionStates. --- Key: HBASE-11773 URL: https://issues.apache.org/jira/browse/HBASE-11773 Project: HBase Issue Type: Bug Components: Region Assignment Reporter: Andrey Stepachev Assignee: Andrey Stepachev Fix For: 0.99.0, 2.0.0, 0.98.6 Attachments: HBASE-11773-0.98.patch, HBASE-11773.patch Protobuf Java Pojo converter uses wrong field for converted enum construction (actually default value of protobuf message used). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-11657) Put HTable region methods in an interface
[ https://issues.apache.org/jira/browse/HBASE-11657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carter updated HBASE-11657: --- Status: Open (was: Patch Available) Put HTable region methods in an interface - Key: HBASE-11657 URL: https://issues.apache.org/jira/browse/HBASE-11657 Project: HBase Issue Type: Improvement Affects Versions: 0.99.0 Reporter: Carter Assignee: Carter Fix For: 0.99.0 Attachments: HBASE_11657.patch, HBASE_11657_v2.patch, HBASE_11657_v3.patch, HBASE_11657_v3.patch, HBASE_11657_v4.patch, HBASE_11657_v5.patch, HBASE_11657_v6.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-11657) Put HTable region methods in an interface
[ https://issues.apache.org/jira/browse/HBASE-11657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carter updated HBASE-11657: --- Attachment: HBASE_11657_v6.patch Put HTable region methods in an interface - Key: HBASE-11657 URL: https://issues.apache.org/jira/browse/HBASE-11657 Project: HBase Issue Type: Improvement Affects Versions: 0.99.0 Reporter: Carter Assignee: Carter Fix For: 0.99.0 Attachments: HBASE_11657.patch, HBASE_11657_v2.patch, HBASE_11657_v3.patch, HBASE_11657_v3.patch, HBASE_11657_v4.patch, HBASE_11657_v5.patch, HBASE_11657_v6.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-11657) Put HTable region methods in an interface
[ https://issues.apache.org/jira/browse/HBASE-11657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carter updated HBASE-11657: --- Status: Patch Available (was: Open) Of course. Added extends Closeable to interface. Put HTable region methods in an interface - Key: HBASE-11657 URL: https://issues.apache.org/jira/browse/HBASE-11657 Project: HBase Issue Type: Improvement Affects Versions: 0.99.0 Reporter: Carter Assignee: Carter Fix For: 0.99.0 Attachments: HBASE_11657.patch, HBASE_11657_v2.patch, HBASE_11657_v3.patch, HBASE_11657_v3.patch, HBASE_11657_v4.patch, HBASE_11657_v5.patch, HBASE_11657_v6.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11625) Reading datablock throws Invalid HFile block magic and can not switch to hdfs checksum
[ https://issues.apache.org/jira/browse/HBASE-11625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14102583#comment-14102583 ] Paul Fleetwood commented on HBASE-11625: I may have a reproduction of this issue. I've generated a bulk loadable HFile (which I've attached) using the HFileOutputFormat, and am unable to perform scans on it. Here is what I do: Running 0.98.5 on Mac in single instance mode... - Use the hbase shell to create a table - Use the bulkload tool to load the attached file into the table: ./bin/hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles - Use the hbase shell to count the rows in the new table - see an exception - Scan the table completely in the hbase shell - Attempt to count the table again, see it succeed Something about the scan fixes things. The exception that I see when running the count is this: ERROR: org.apache.hadoop.hbase.exceptions.OutOfOrderScannerNextException: Expected nextCallSeq: 2 But the nextCallSeq got from client: 1; request=scanner_id: 547 number_of_rows: 10 close_scanner: false next_call_seq: 1 at org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3110) at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29587) at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2026) at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:98) at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:114) at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:94) at java.lang.Thread.run(Thread.java:695) But, I've learned that this is a red herring. This issue is the result of a client retry (causing the sequence numbers to mismatch).
The retry itself is caused by another failure, which looks like this (please note the following callstacks were captured while running my own application, but I believe they are the same in the count): java.io.IOException: Could not iterate StoreFileScanner[HFileScanner for reader reader=file:/var/folders/bn/qpypwv8s3r7g3ksgxdj0hlw8gn/T/hbase-paulfleetwood/hbase/data/paul2_lxlv1_prod/events/d2697b9d34be632e481ab33433a28699/common/69823512bede4392be352761adc669e6_SeqId_26_, compression=none, cacheConf=CacheConfig:enabled [cacheDataOnRead=true] [cacheDataOnWrite=false] [cacheIndexesOnWrite=false] [cacheBloomsOnWrite=false] [cacheEvictOnClose=false] [cacheCompressed=false][prefetchOnOpen=false], firstKey=\x00Sa\xC8Dw\xE3\xE8i\x9C\xD2\xDB\x1C\xC3Mk\xF4\x99!1\xD1\xBF\x99w/common:entity\x00id\x00\x03/1398917188791/Put, lastKey=\x0FS\x88\xAD\xE1\xFA\x11:\x85\x88\xDE\x12\x12\xF0\xD8s\x9A\x06X\x1B\x84\x1A\x8B\xA7\xC1/common:txn\x00timeoutPolicy\x00\x03/1401466337205/Put, avgKeyLen=58, avgValueLen=11, entries=345444, length=27516812, cur=\x00Sc'M\x5C\xC8\x0A\xD5\xC0P\xA53U\x01,\xDF=\x8D\x0F\xA6\x00\xCC\xCB/common:entity\x00type\x00\x03/1399007053829/Put/vlen=4/mvcc=0] This is caused by something like the following: java.io.IOException: Failed to read compressed block at 65621, onDiskSizeWithoutHeader=65620, preReadHeaderSize=33, header.length=33, header bytes: \x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00 Which, is caused by: java.io.IOException: Invalid HFile block magic: FJ\xA8Yt\x04@$ That last exception is thrown with the following call stack: BlockType.parse(byte[], int, int) line: 154 BlockType.read(ByteBuffer) line: 165 HFileBlock.init(ByteBuffer, boolean) line: 239 HFileBlock$FSReaderV2.readBlockDataInternal(FSDataInputStream, long, long, int, boolean, boolean) line: 1446 HFileBlock$FSReaderV2.readBlockData(long, long, int, boolean) line: 1312 HFileReaderV2.readBlock(long, long, 
boolean, boolean, boolean, boolean, BlockType) line: 387 HFileReaderV2$ScannerV2(HFileReaderV2$AbstractScannerV2).readNextDataBlock() line: 635 HFileReaderV2$ScannerV2.next() line: 749 StoreFileScanner.next() line: 136 KeyValueHeap.next() line: 108 StoreScanner.next(ListCell, int) line: 537 KeyValueHeap.next(ListCell, int) line: 140 HRegion$RegionScannerImpl.populateResult(ListCell, KeyValueHeap, int, byte[], int, short) line: 3937 HRegion$RegionScannerImpl.nextInternal(ListCell, int) line: 4017 HRegion$RegionScannerImpl.nextRaw(ListCell, int) line: 3885 HRegion$RegionScannerImpl.nextRaw(ListCell) line: 3876 HRegionServer.scan(RpcController, ClientProtos$ScanRequest) line: 3158 ClientProtos$ClientService$2.callBlockingMethod(Descriptors$MethodDescriptor, RpcController, Message) line: 29587 RpcServer.call(BlockingService, MethodDescriptor, Message, CellScanner, long, MonitoredRPCHandler) line: 2026 CallRunner.run() line: 98
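For context on the final exception: every HFile block begins with an 8-byte magic record identifying its block type, and BlockType.parse throws when the bytes read from disk match no known type. A minimal sketch of that check follows; 'DATABLK*' is the real HFile v2 data-block magic, while the surrounding code is illustrative only:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Minimal sketch of the magic check that fails in the stack above. Corrupt or
// misaligned bytes at a block boundary match no known magic, which surfaces
// as "Invalid HFile block magic: ..." in BlockType.parse.
final class BlockMagic {
    static final byte[] DATABLK = "DATABLK*".getBytes(StandardCharsets.US_ASCII);

    // True if the 8 bytes at offset are the data-block magic. The real code
    // checks every known block type's magic, not just the data block's.
    static boolean isDataBlockMagic(byte[] buf, int offset) {
        return Arrays.equals(Arrays.copyOfRange(buf, offset, offset + 8), DATABLK);
    }
}
```

The reported garbage magic ("FJ\xA8Yt\x04@$") suggests the reader landed inside compressed data rather than at a real block boundary, consistent with the "Failed to read compressed block" wrapper exception.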
[jira] [Updated] (HBASE-11625) Reading datablock throws Invalid HFile block magic and can not switch to hdfs checksum
[ https://issues.apache.org/jira/browse/HBASE-11625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Fleetwood updated HBASE-11625: --- Affects Version/s: 0.98.5 Reading datablock throws Invalid HFile block magic and can not switch to hdfs checksum - Key: HBASE-11625 URL: https://issues.apache.org/jira/browse/HBASE-11625 Project: HBase Issue Type: Bug Components: HFile Affects Versions: 0.94.21, 0.98.4, 0.98.5 Reporter: qian wang Attachments: 2711de1fdf73419d9f8afc6a8b86ce64.gz When using HBase checksums, a call to readBlockDataInternal() in HFileBlock.java can encounter file corruption, but it can only switch to the HDFS checksum input stream at validateBlockChecksum(). If the data block's header is corrupted when b = new HFileBlock() is constructed, it throws the Invalid HFile block magic exception and the RPC call fails. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-11625) Reading datablock throws Invalid HFile block magic and can not switch to hdfs checksum
[ https://issues.apache.org/jira/browse/HBASE-11625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Fleetwood updated HBASE-11625: --- Attachment: 2711de1fdf73419d9f8afc6a8b86ce64.gz I gzip'ed the file so that it would be under the upload size constraint. Reading datablock throws Invalid HFile block magic and can not switch to hdfs checksum - Key: HBASE-11625 URL: https://issues.apache.org/jira/browse/HBASE-11625 Project: HBase Issue Type: Bug Components: HFile Affects Versions: 0.94.21, 0.98.4, 0.98.5 Reporter: qian wang Attachments: 2711de1fdf73419d9f8afc6a8b86ce64.gz -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-10378) Divide HLog interface into User and Implementor specific interfaces
[ https://issues.apache.org/jira/browse/HBASE-10378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14102596#comment-14102596 ] Sean Busbey commented on HBASE-10378: - This has turned into a bit of a time sink. Now that I have small tests passing, I'm posting some WIP to get suggestions early about parts that are worth breaking off into different tickets. Note that the RB patch is definitely not something that's ready to go in. * [github branch|https://github.com/busbey/hbase/tree/HBASE-10378] - has both the original work from this ticket rebased onto master and then follow on changes. * [reviewboard for the net changes on top of master|https://reviews.apache.org/r/24857/] (warning: it's 10 pages) RB description has the major compatibility goals and current failings. The github WIP commit message has questions that I'm trying to work out in my head. I'd love to get feedback on the compatibility stuff and suggestions on the open questions. Divide HLog interface into User and Implementor specific interfaces --- Key: HBASE-10378 URL: https://issues.apache.org/jira/browse/HBASE-10378 Project: HBase Issue Type: Sub-task Components: wal Reporter: Himanshu Vashishtha Assignee: Sean Busbey Attachments: 10378-1.patch, 10378-2.patch HBASE-5937 introduces the HLog interface as a first step to support multiple WAL implementations. This interface is a good start, but has some limitations/drawbacks in its current state, such as: 1) There is no clear distinction b/w User and Implementor APIs, and it provides APIs both for WAL users (append, sync, etc) and also WAL implementors (Reader/Writer interfaces, etc). There are APIs which are very much implementation specific (getFileNum, etc) and a user such as a RegionServer shouldn't know about it. 2) There are about 14 methods in FSHLog which are not present in HLog interface but are used at several places in the unit test code. 
These tests typecast HLog to FSHLog, which makes it very difficult to test multiple WAL implementations without doing some ugly checks. I'd like to propose some changes to the HLog interface that would ease the multi-WAL story: 1) Have two interfaces, WAL and WALService. WAL provides APIs for implementors. WALService provides APIs for users (such as a RegionServer). 2) A skeleton implementation of the above two interfaces as the base class for other WAL implementations (AbstractWAL). It provides the required fields for all subclasses (fs, conf, log dir, etc). Make a minimal set of test-only methods and add this set to AbstractWAL. 3) HLogFactory returns a WALService reference when creating a WAL instance; if a user needs to access impl-specific APIs (there are unit tests which get the WAL from an HRegionServer and then call impl-specific APIs), use an AbstractWAL type cast. 4) Make TestHLog abstract and let all implementors provide their respective test classes which extend TestHLog (TestFSHLog, for example). -- This message was sent by Atlassian JIRA (v6.2#6252)
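Point 1 of the proposal can be sketched as two nested interfaces plus a stand-in implementation. All method names below are hypothetical placeholders chosen for illustration, not the actual HLog API:

```java
// Sketch of the proposed split: users (e.g. a RegionServer) program against
// WALService only; WAL extends it with implementor-facing methods that stay
// hidden behind the factory. Names here are illustrative placeholders.
interface WALService {
    long append(byte[] regionName, byte[] entry);   // user-facing
    void sync();                                    // user-facing
}

interface WAL extends WALService {
    long getFilenum();                              // implementor-facing
}

// Minimal in-memory stand-in showing the factory shape: the factory hands out
// a WALService, so callers cannot reach the WAL-only methods without a cast.
final class InMemoryWAL implements WAL {
    private long seq = 0;
    public long append(byte[] regionName, byte[] entry) { return ++seq; }
    public void sync() { }
    public long getFilenum() { return 1; }

    static WALService createUserFacing() { return new InMemoryWAL(); }
}
```

The payoff is in testing: a test written against WALService runs unchanged over any implementation, which is exactly what the current FSHLog typecasts prevent.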
[jira] [Commented] (HBASE-11657) Put HTable region methods in an interface
[ https://issues.apache.org/jira/browse/HBASE-11657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14102613#comment-14102613 ] stack commented on HBASE-11657: --- +1 on v6 (Waiting on Pope @enis to give his blessing before commit) Put HTable region methods in an interface - Key: HBASE-11657 URL: https://issues.apache.org/jira/browse/HBASE-11657 Project: HBase Issue Type: Improvement Affects Versions: 0.99.0 Reporter: Carter Assignee: Carter Fix For: 0.99.0 Attachments: HBASE_11657.patch, HBASE_11657_v2.patch, HBASE_11657_v3.patch, HBASE_11657_v3.patch, HBASE_11657_v4.patch, HBASE_11657_v5.patch, HBASE_11657_v6.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11773) Wrong field used for protobuf construction in RegionStates.
[ https://issues.apache.org/jira/browse/HBASE-11773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14102680#comment-14102680 ] Hudson commented on HBASE-11773: SUCCESS: Integrated in HBase-0.98 #458 (See [https://builds.apache.org/job/HBase-0.98/458/]) HBASE-11773 Wrong field used for protobuf construction in RegionStates (Andrey Stepachev) (apurtell: rev dbda5c38feb28aef2ee3829264cbe39af54c958d) * hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestRegionState.java * hbase-client/src/main/java/org/apache/hadoop/hbase/master/RegionState.java Wrong field used for protobuf construction in RegionStates. --- Key: HBASE-11773 URL: https://issues.apache.org/jira/browse/HBASE-11773 Project: HBase Issue Type: Bug Components: Region Assignment Reporter: Andrey Stepachev Assignee: Andrey Stepachev Fix For: 0.99.0, 2.0.0, 0.98.6 Attachments: HBASE-11773-0.98.patch, HBASE-11773.patch Protobuf Java Pojo converter uses wrong field for converted enum construction (actually default value of protobuf message used). -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-10092) Move up on to log4j2
[ https://issues.apache.org/jira/browse/HBASE-10092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14102712#comment-14102712 ] Andrew Purtell commented on HBASE-10092: bq. I don't think properties is a blocker one way or another. See Stack's comment above. The change will not be palatable for current or anticipated releases (0.98, 1.0) without properties file compatibility. A trunk-only change would still be a great contribution, but with less impact. I would really like to see async logging become a possibility in 0.98, so will need to do this work at least in that branch. bq. Any reason why we should not consider logback? It looks like supporting unit testing with the new version of surefire is going to be very hard with log4j2 Hadoop is looking at moving up to log4j2 also. What kind of hell will we be in if Hadoop is on log4j2 and we are on logback? Isn't log4j2 the continuation of logback? Move up on to log4j2 Key: HBASE-10092 URL: https://issues.apache.org/jira/browse/HBASE-10092 Project: HBase Issue Type: Sub-task Reporter: stack Assignee: Alex Newman Fix For: 2.0.0 Attachments: 10092.txt, 10092v2.txt, HBASE-10092.patch Allows logging with less friction. See http://logging.apache.org/log4j/2.x/ This rather radical transition can be done w/ minor change given they have an adapter for apache's logging, the one we use. They also have an adapter for slf4j, so we likely can remove at least some of the 4 versions of this module our dependencies make use of. I made a start in the attached patch but am currently stuck in maven dependency-resolution hell courtesy of our slf4j. Fixing will take some concentration and a good net connection, an item I currently lack. Other TODOs: we will need to fix our little log-level-setting jsp page -- will likely have to undo our use of hadoop's tool here -- and the config system changes a little. I will return to this project soon. Will bring numbers. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-10092) Move up on to log4j2
[ https://issues.apache.org/jira/browse/HBASE-10092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14102713#comment-14102713 ] Andrew Purtell commented on HBASE-10092: bq. If support for properties files is that important, vote for LOG4J2-635. As requested, [~jvz]!
[jira] [Updated] (HBASE-11774) Avoid allocating unnecessary tag iterators
[ https://issues.apache.org/jira/browse/HBASE-11774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-11774: --- Resolution: Fixed Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Pushed v2 patch to 0.98+. Thanks for the reviews! Avoid allocating unnecessary tag iterators -- Key: HBASE-11774 URL: https://issues.apache.org/jira/browse/HBASE-11774 Project: HBase Issue Type: Improvement Reporter: Andrew Purtell Assignee: Andrew Purtell Priority: Minor Fix For: 0.99.0, 2.0.0, 0.98.6 Attachments: HBASE-11774.patch, HBASE-11774_v2-0.98.patch, HBASE-11774_v2.patch We can avoid an unnecessary object allocation, sometimes in hot code paths, by not creating a tag iterator if the cell's tag area is of length zero, signifying no tags present.
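The optimization described above boils down to a guard on the tag-area length before any iterator object is created. The following is a hypothetical, simplified sketch -- the method name and the one-byte-length tag wire format here are stand-ins for illustration, not HBase's actual Cell/Tag API or the committed patch:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

public class TagIteratorSketch {
    // Return the tags serialized in tagArea. On the hot path, a zero-length
    // tag area means "no tags", so we hand back a shared empty iterator
    // instead of allocating a fresh parsing iterator on every call.
    static Iterator<byte[]> tagsIterator(byte[] tagArea) {
        if (tagArea == null || tagArea.length == 0) {
            return Collections.emptyIterator();
        }
        // Toy wire format for this sketch: [1-byte length][tag bytes], repeated.
        List<byte[]> tags = new ArrayList<>();
        int pos = 0;
        while (pos < tagArea.length) {
            int len = tagArea[pos] & 0xFF;
            byte[] tag = new byte[len];
            System.arraycopy(tagArea, pos + 1, tag, 0, len);
            tags.add(tag);
            pos += 1 + len;
        }
        return tags.iterator();
    }

    public static void main(String[] args) {
        // No-tags cell: no per-call iterator object is allocated.
        if (tagsIterator(new byte[0]).hasNext()) throw new AssertionError();
        Iterator<byte[]> it = tagsIterator(new byte[]{2, 'h', 'i'});
        System.out.println(new String(it.next())); // prints "hi"
    }
}
```

The shared `Collections.emptyIterator()` singleton is what makes the fast path allocation-free, which is the point of the patch.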
[jira] [Commented] (HBASE-11773) Wrong field used for protobuf construction in RegionStates.
[ https://issues.apache.org/jira/browse/HBASE-11773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14102753#comment-14102753 ] Hudson commented on HBASE-11773: SUCCESS: Integrated in HBase-TRUNK #5411 (See [https://builds.apache.org/job/HBase-TRUNK/5411/]) HBASE-11773 Wrong field used for protobuf construction in RegionStates (Andrey Stepachev) (apurtell: rev 393a2a3814a85e4b985aba89243101b23220eed1) * hbase-client/src/main/java/org/apache/hadoop/hbase/master/RegionState.java * hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestRegionState.java Wrong field used for protobuf construction in RegionStates. --- Key: HBASE-11773 URL: https://issues.apache.org/jira/browse/HBASE-11773 Project: HBase Issue Type: Bug Components: Region Assignment Reporter: Andrey Stepachev Assignee: Andrey Stepachev Fix For: 0.99.0, 2.0.0, 0.98.6 Attachments: HBASE-11773-0.98.patch, HBASE-11773.patch The protobuf-to-Java POJO converter uses the wrong field when constructing the converted enum (the default value of the protobuf message is used instead).
[jira] [Commented] (HBASE-11778) Scale timestamps by 1000
[ https://issues.apache.org/jira/browse/HBASE-11778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14102756#comment-14102756 ] Lars Hofhansl commented on HBASE-11778: --- [~giacomotay...@gmail.com], FYI. Scale timestamps by 1000 Key: HBASE-11778 URL: https://issues.apache.org/jira/browse/HBASE-11778 Project: HBase Issue Type: Brainstorming Reporter: Lars Hofhansl The KV timestamps are used for various reasons: # ordering of KVs # resolving conflicts # enforcing TTL Currently we assume that the timestamps have a resolution of 1ms, and because of that we made the resolution at which we can determine time identical to the resolution at which we can store time. I think it is time to disentangle the two... At least allow a higher resolution of time to be stored. That way we could have a centralized transaction oracle that produces ids that relate to wall clock time, and at the same time allow producing more than 1000/s. The simplest way is to just store time in us (microseconds). I.e. we'd still collect time in ms by default and just multiply it by 1000 before we store it. With 8 bytes that still gives us a range of 292471 years. We'd have to grandfather in old data. Could write a metadata entry into each HFile declaring what the TS resolution is if it is different from ms. Not sure, yet, how this would relate to using the TS for things like seqIds. Let's do some brainstorming.
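The arithmetic behind the proposal is easy to check. A minimal sketch of the ms-to-µs scaling (helper names are invented for illustration; this is not HBase code) also confirms the 292471-year figure quoted for a signed 64-bit microsecond timestamp:

```java
public class MicrosTimestampSketch {
    static final long SCALE = 1000L; // collect ms, store µs

    // Multiply the wall-clock ms timestamp by 1000 before storing, leaving
    // three decimal digits of sub-millisecond room for an oracle's ids.
    static long toStored(long millis) { return millis * SCALE; }
    static long toMillis(long stored) { return stored / SCALE; }

    public static void main(String[] args) {
        long ms = System.currentTimeMillis();
        // Round-trips losslessly for ms-resolution inputs.
        if (toMillis(toStored(ms)) != ms) throw new AssertionError();
        // Range of a signed 64-bit µs value, in 365-day years.
        long years = Long.MAX_VALUE / SCALE / 1000L / 3600L / 24L / 365L;
        System.out.println(years); // prints 292471, matching the figure above
    }
}
```

The low three decimal digits of each stored value stay free, which is what would let a transaction oracle hand out more than 1000 ids per wall-clock second while the high digits still relate to real time.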
[jira] [Created] (HBASE-11778) Scale timestamps by 1000
Lars Hofhansl created HBASE-11778: - Summary: Scale timestamps by 1000 Key: HBASE-11778 URL: https://issues.apache.org/jira/browse/HBASE-11778 Project: HBase Issue Type: Brainstorming Reporter: Lars Hofhansl
[jira] [Updated] (HBASE-11778) Scale timestamps by 1000
[ https://issues.apache.org/jira/browse/HBASE-11778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Taylor updated HBASE-11778: - Tags: Phoenix
[jira] [Commented] (HBASE-11165) Scaling so cluster can host 1M regions and beyond (50M regions?)
[ https://issues.apache.org/jira/browse/HBASE-11165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14102764#comment-14102764 ] stack commented on HBASE-11165: --- bq. If split meta, then 1) Less write amplification (ie no large compactions) ... Good point. i.e. if we want to move to lots of small regions, it would be odd if there was an "except for meta" clause. bq. Better W throughput. If Master is the only writer, we'd need to ensure we are writing in parallel (i.e. Virag's recent patches). bq. 2) More disks, more R/W throughput. Yes. bq. More heap to fit meta... More heap to cache meta, yes. bq. ...We need to do experiments for 1 rack and 2 rack failure... Agreed that in time of catastrophic part-failure, we'd need the better R/W throughput a split meta can give you. Other pluses are that we would treat meta like any other table. Negatives are that we need our root back and startup is more complicated (but at least all inside a single master in this case). In https://docs.google.com/document/d/1xC-bCzAAKO59Xo3XN-Cl6p-5CM_4DMoR-WpnkmYZgpw/edit# I (and others) argue for colocated meta and master going forward, looking at options. Let me freshen it with arguments made here. Colocating meta and master has nice properties. The in-memory image of the cluster layout -- probably a severe subset of what is actually in meta -- would need to fit a single server's RAM in either model. When colocated, operations are faster and less prone to error when less RPC is involved (we'd still be subject to http://writings.quilt.org/2014/05/12/distributed-systems-and-the-end-of-the-api/ if persisting meta in hdfs, as Francis notes above). A single machine hosting a single meta would not be able to service a 50M region startup with hundreds of regionservers as well as a deploy with split meta. It could. It'd just be slower.
Colocated meta and master implies single meta forever, and that single meta is served by one server only -- a 50M meta region would be an anomaly in the cluster, being bigger than all the rest -- at least until we have HBASE-10295 Refactor the replication implementation to eliminate permanent zk node and/or HBASE-11467 New impl of Registry interface not using ZK + new RPCs on master protocol (maybe a later phase of HBASE-10070, when followers can run closer in to the leader state, would work here), or a new master layout where we partition meta across multiple master servers. A plus split meta has over colocated master and meta is that master currently can be down for some period of time and the cluster keeps working; no splits and no merges, and if a machine crashes while master is down, data is offline till master comes back (needs more exercise). This is less the case when master and meta are colocated. Please pile on all with thoughts. We need to put stakes in the ground soon for hbase 2.0 cluster topology. Francis needs something in the 0.98 timeframe. If the 0.98 approach is different from what folks want for 2.0, then as per Andy, let's split this issue. Thoughts-for-the-day: + HBase is supposed to be able to scale + Single meta came about because way back, we were too lazy to fix issues that arose when meta was split (at the time, we didn't need to scale as much). Scaling so cluster can host 1M regions and beyond (50M regions?) Key: HBASE-11165 URL: https://issues.apache.org/jira/browse/HBASE-11165 Project: HBase Issue Type: Brainstorming Reporter: stack Attachments: HBASE-11165.zip, Region Scalability test.pdf, zk_less_assignment_comparison_2.pdf This discussion issue comes out of Co-locate Meta And Master HBASE-10569 and comments on the doc posted there. A user -- our Francis Liu -- needs to be able to scale a cluster to 1M regions, maybe even 50M later. This issue is about discussing how we will do that (or, if not 50M on a cluster, how otherwise we can attain the same end). More detail to follow.
[jira] [Commented] (HBASE-11778) Scale timestamps by 1000
[ https://issues.apache.org/jira/browse/HBASE-11778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14102771#comment-14102771 ] stack commented on HBASE-11778: --- [~lhofhansl] see https://issues.apache.org/jira/browse/HBASE-8927 Could we do it for 1.0?
[jira] [Reopened] (HBASE-11625) Reading datablock throws Invalid HFile block magic and can not switch to hdfs checksum
[ https://issues.apache.org/jira/browse/HBASE-11625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack reopened HBASE-11625: --- Reopening because Paul loaded up what was asked for. Reading datablock throws Invalid HFile block magic and can not switch to hdfs checksum - Key: HBASE-11625 URL: https://issues.apache.org/jira/browse/HBASE-11625 Project: HBase Issue Type: Bug Components: HFile Affects Versions: 0.94.21, 0.98.4, 0.98.5 Reporter: qian wang Attachments: 2711de1fdf73419d9f8afc6a8b86ce64.gz When using HBase checksums, readBlockDataInternal() in HFileBlock.java can encounter file corruption, but it can only switch to the HDFS checksum input stream at validateBlockChecksum(). If the data block's header is corrupted when b = new HFileBlock(), it throws an Invalid HFile block magic exception and the RPC call fails.
[jira] [Commented] (HBASE-11742) Backport HBASE-7987 and HBASE-11185 to 0.98
[ https://issues.apache.org/jira/browse/HBASE-11742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14102809#comment-14102809 ] Andrew Purtell commented on HBASE-11742: What happens if there is an older client library that wants to run an MR-over-snapshots job against snapshots dropped by newer servers? The older library will be looking for table descriptors and snapshot region names in the FS instead of SnapshotFileInfo/SnapshotRegionManifest. Do we continue to handle TableSnapshotRegionSplit messages that have RegionSpecifiers (field #1) instead of the new TableSchema and RegionInfo fields (3 and 4)? What happens if someone uses the newer client in an MR-over-snapshots job where there is no SnapshotFileInfo/SnapshotRegionManifest data available because the servers are older? What happens when we have a mix of snapshots dropped by an older server side versus a newer server side? Backport HBASE-7987 and HBASE-11185 to 0.98 --- Key: HBASE-11742 URL: https://issues.apache.org/jira/browse/HBASE-11742 Project: HBase Issue Type: Improvement Components: mapreduce, snapshots Affects Versions: 0.98.5 Reporter: Esteban Gutierrez Assignee: Esteban Gutierrez Fix For: 0.98.6 Attachments: HBASE-11742.v0.patch, HBASE-11742.v1.patch HBASE-7987 improves how snapshots are handled via a manifest file. This requires reverting HBASE-11360 since it introduces alternate functionality that is not compatible with HBASE-7987.
[jira] [Updated] (HBASE-11607) Document HBase metrics
[ https://issues.apache.org/jira/browse/HBASE-11607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-11607: -- Resolution: Fixed Fix Version/s: 2.0.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Excellent. Committed w/ minor fixes [~misty] Thanks. We can figure out how to dump metrics at a later date. Document HBase metrics -- Key: HBASE-11607 URL: https://issues.apache.org/jira/browse/HBASE-11607 Project: HBase Issue Type: Sub-task Components: documentation, metrics Reporter: Misty Stanley-Jones Assignee: Misty Stanley-Jones Labels: beginner Fix For: 2.0.0 Attachments: HBASE-11607.patch
[jira] [Commented] (HBASE-4920) We need a mascot, a totem
[ https://issues.apache.org/jira/browse/HBASE-4920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14102822#comment-14102822 ] stack commented on HBASE-4920: -- Let me stick something up on the list, [~jmspaggi]. I'm responsible for the mess here. We need a mascot, a totem - Key: HBASE-4920 URL: https://issues.apache.org/jira/browse/HBASE-4920 Project: HBase Issue Type: Task Reporter: stack Attachments: Apache_HBase_Orca_Logo_1.jpg, Apache_HBase_Orca_Logo_Mean_version-3.pdf, Apache_HBase_Orca_Logo_Mean_version-4.pdf, Apache_HBase_Orca_Logo_round5.pdf, HBase Orca Logo.jpg, Orca_479990801.jpg, Screen shot 2011-11-30 at 4.06.17 PM.png, apache hbase orca logo_Proof 3.pdf, apache logo_Proof 8.pdf, jumping-orca_rotated.xcf, jumping-orca_rotated_right.png, krake.zip, more_orcas.png, more_orcas2.png, orca_clipart_freevector_lhs.jpeg, orca_free_vector_on_top_66percent_levelled.png, orca_free_vector_sheared_rotated_rhs.png, orca_free_vector_some_selections.png, photo (2).JPG, plus_orca.png, proposal_1_logo.png, proposal_1_logo.xcf, proposal_2_logo.png, proposal_2_logo.xcf, proposal_3_logo.png, proposal_3_logo.xcf We need a totem for our t-shirt that is yet to be printed. O'Reilly owns the Clydesdale. We need something else. We could have a fluffy little duck that quacks 'hbase!' when you squeeze it and we could order boxes of them from some off-shore sweatshop that subcontracts to a contractor who employs child labor only. Or we could have an Orca (Big!, Fast!, Killer!, and in a poem that Marcy from Salesforce showed me, that was a bit too spiritual for me to be seen quoting here, it had the Orca as the 'Guardian of the Cosmic Memory': i.e. in translation, bigdata).
[jira] [Commented] (HBASE-11742) Backport HBASE-7987 and HBASE-11185 to 0.98
[ https://issues.apache.org/jira/browse/HBASE-11742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14102826#comment-14102826 ] Matteo Bertozzi commented on HBASE-11742: - {quote}What happens if there is an older client library that wants to run an MR-over-snapshots job against snapshots dropped by newer servers? The older library will be looking for table descriptors and snapshot region names in the FS instead of SnapshotFileInfo/SnapshotRegionManifest.{quote} You can't use old jars to read the new format. {quote}Do we continue to handle TableSnapshotRegionSplit messages that have RegionSpecifiers (field #1) instead of the new TableSchema and RegionInfo fields (3 and 4)?{quote} That proto is internal to the MapReduce job and used just for message passing, so there is no need for compatibility there, unless you have different jars executing the MR job. {quote}What happens if someone uses the newer client in an MR-over-snapshots job where there is no SnapshotFileInfo/SnapshotRegionManifest data available because the servers are older?{quote} The new code is able to read the old format. {quote}What happens when we have a mix of snapshots dropped by an older server side versus a newer server side?{quote} The new code supports taking snapshots during rolling upgrades, which means that the older RS writes in the old format and the new one in the new format. This is fine since both formats are readable and aggregated on read if necessary (requires the new jars). If the master is already updated, it will do the merge of the RS results, converting to the single manifest if necessary.
[jira] [Comment Edited] (HBASE-11742) Backport HBASE-7987 and HBASE-11185 to 0.98
[ https://issues.apache.org/jira/browse/HBASE-11742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14102850#comment-14102850 ] Andrew Purtell edited comment on HBASE-11742 at 8/19/14 9:12 PM: - {quote} bq. What happens if there is an older client library that wants to run an MR-over-snapshots job against snapshots dropped by newer servers? The older library will be looking for table descriptors and snapshot region names in the FS instead of SnapshotFileInfo/SnapshotRegionManifest. you can't use old jars to read the new format. {quote} We can't force an upgrade of an older client with a minor server bump. The server side needs to support older clients until the client fleet can be upgraded independent of server side. Thanks for clarifying elsewhere, so this is the only issue with the current patch. Can we keep server side support for writing both formats with the backwards compatible option the default? Some new configuration setting will have a default of false (or 1).
[jira] [Commented] (HBASE-11742) Backport HBASE-7987 and HBASE-11185 to 0.98
[ https://issues.apache.org/jira/browse/HBASE-11742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14102850#comment-14102850 ] Andrew Purtell commented on HBASE-11742: {quote} bq. What happens if there is an older client library that wants to run an MR-over-snapshots job against snapshots dropped by newer servers? The older library will be looking for table descriptors and snapshot region names in the FS instead of SnapshotFileInfo/SnapshotRegionManifest. you can't use old jars to read the new format. {quote} We can't force an upgrade of an older client with a minor server bump. The server side needs to support older clients until the client fleet can be upgraded independent of server side. Thanks for clarifying elsewhere, so this is the only issue with the current patch. Can we keep server side support for writing both formats with the backwards compatible option the default? Some new configuration setting will have a default of false (or 1).
[jira] [Comment Edited] (HBASE-11742) Backport HBASE-7987 and HBASE-11185 to 0.98
[ https://issues.apache.org/jira/browse/HBASE-11742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14102850#comment-14102850 ] Andrew Purtell edited comment on HBASE-11742 at 8/19/14 9:14 PM: - {quote} bq. What happens if there is an older client library that wants to run an MR-over-snapshots job against snapshots dropped by newer servers? The older library will be looking for table descriptors and snapshot region names in the FS instead of SnapshotFileInfo/SnapshotRegionManifest. you can't use old jars to read the new format. {quote} We can't force an upgrade of an older client with a minor server bump. The server side needs to support older clients until the client fleet can be upgraded independent of server side. Thanks for clarifying elsewhere, so this is the only issue with the current patch. Can we keep server side support for writing both formats with the backwards compatible option the default? Some new configuration setting will have a default of false (or 1). We can add a release note describing the necessary steps to move to the improved functionality.
[jira] [Commented] (HBASE-11742) Backport HBASE-7987 and HBASE-11185 to 0.98
[ https://issues.apache.org/jira/browse/HBASE-11742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14102860#comment-14102860 ] Matteo Bertozzi commented on HBASE-11742: - yeah, that should be easy. It just needs a conf that sets this property in SnapshotDescriptionUtil.java {code} public static final int SNAPSHOT_LAYOUT_VERSION = SnapshotManifestV2.DESCRIPTOR_VERSION; {code} let me do that
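What Matteo describes could look roughly like the following: the layout version is read from configuration instead of being a hardcoded constant, with the backwards-compatible format as the default Andrew asked for. This is a hedged sketch -- the property name, the version constants, and the tiny Conf stand-in are illustrative, not HBase's actual code:

```java
import java.util.HashMap;
import java.util.Map;

public class SnapshotLayoutSketch {
    // Illustrative version constants (the real values live in the
    // SnapshotManifest V1/V2 classes).
    static final int LAYOUT_V1 = 1; // old per-file fs layout
    static final int LAYOUT_V2 = 2; // manifest-based layout from HBASE-7987

    /** Minimal stand-in for Hadoop's Configuration. */
    static class Conf {
        private final Map<String, Integer> props = new HashMap<>();
        void setInt(String key, int value) { props.put(key, value); }
        int getInt(String key, int dflt) { return props.getOrDefault(key, dflt); }
    }

    // Instead of a hardcoded SNAPSHOT_LAYOUT_VERSION constant, read the
    // version from conf; defaulting to V1 keeps older client jars working.
    static int snapshotLayoutVersion(Conf conf) {
        return conf.getInt("hbase.snapshot.format.version", LAYOUT_V1);
    }

    public static void main(String[] args) {
        Conf conf = new Conf();
        System.out.println(snapshotLayoutVersion(conf)); // prints 1 (default)
        conf.setInt("hbase.snapshot.format.version", LAYOUT_V2);
        System.out.println(snapshotLayoutVersion(conf)); // prints 2
    }
}
```

Keeping the old format as the default means operators opt in to the new manifest layout only after their client fleet is upgraded, which is the compatibility behavior discussed above.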
[jira] [Commented] (HBASE-11682) Explain hotspotting
[ https://issues.apache.org/jira/browse/HBASE-11682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14102867#comment-14102867 ] Nick Dimiduk commented on HBASE-11682: -- Very well articulated example, I like it! [~jmhsieh] you're right in that I don't think of using random data for a prefix because the nondeterminism makes gets ineffective. It is, however, a valid approach. {noformat} +paraSuppose you have the following list of row keys:/para {noformat} This example assumes the table is split in a way such that f* would be in a single region but a-, b-, c-, d- are in different regions. Be explicit about the region splits; include a sentence like "assume your table is split by letter, so the rowkey prefix {{a}} is on one region, {{b}} is on a second, {{c}} on a third, etc." In that topology, all the foo rows would be in the same region, and the prefixed rows are in different regions. {noformat} +titleHashing/title {noformat} For this bit, you can add something like "using a deterministic hash allows the client to reconstruct the complete rowkey and use a get operation to retrieve that row as normal." The current text alludes to this, but maybe we can come out and say it explicitly. For references, you could also link off to Phoenix's Salted Tables description http://phoenix.apache.org/salted.html Explain hotspotting --- Key: HBASE-11682 URL: https://issues.apache.org/jira/browse/HBASE-11682 Project: HBase Issue Type: Task Components: documentation Reporter: Misty Stanley-Jones Assignee: Misty Stanley-Jones Attachments: HBASE-11682-1.patch, HBASE-11682.patch, HBASE-11682.patch, HBASE-11682.patch
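The salting-vs-hashing distinction Nick draws can be sketched in a few lines: with a random salt the prefix cannot be recomputed at read time, so a point get must fan out over all buckets, while with a deterministic hash the client rebuilds the full rowkey and issues a normal get. The names and the 4-region pre-split below are illustrative assumptions, not HBase API:

```java
import java.util.Random;

public class RowKeyPrefixSketch {
    static final int BUCKETS = 4; // assume the table is pre-split into 4 regions
    static final Random RNG = new Random();

    // Random salt: spreads sequential writes across regions, but a reader
    // cannot reconstruct the prefix, so a get must try every bucket.
    static String saltedKey(String row) {
        return RNG.nextInt(BUCKETS) + "-" + row;
    }

    // Deterministic hash prefix: same spread, but the client can recompute
    // the prefix from the row alone and issue a single get as normal.
    static String hashedKey(String row) {
        return Math.floorMod(row.hashCode(), BUCKETS) + "-" + row;
    }

    public static void main(String[] args) {
        // hashedKey is stable, so a get can target the exact stored rowkey.
        System.out.println(hashedKey("foo0001").equals(hashedKey("foo0001"))); // true
        System.out.println(saltedKey("foo0001")); // e.g. "2-foo0001"; bucket varies per call
    }
}
```

`Math.floorMod` is used instead of `%` so a negative `hashCode()` still maps into a valid bucket.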
[jira] [Commented] (HBASE-11682) Explain hotspotting
[ https://issues.apache.org/jira/browse/HBASE-11682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14102875#comment-14102875 ] Jonathan Hsieh commented on HBASE-11682: +1 to Nick's clarifications
[jira] [Commented] (HBASE-11735) Document Configurable Bucket Sizes in bucketCache
[ https://issues.apache.org/jira/browse/HBASE-11735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14102883#comment-14102883 ] Misty Stanley-Jones commented on HBASE-11735: - What do you think, [~stack]? Document Configurable Bucket Sizes in bucketCache - Key: HBASE-11735 URL: https://issues.apache.org/jira/browse/HBASE-11735 Project: HBase Issue Type: Task Components: documentation Affects Versions: 0.99.0, 0.98.4, 0.98.5 Reporter: Misty Stanley-Jones Assignee: Misty Stanley-Jones Fix For: 0.99.0, 0.98.6 Attachments: HBASE-11735.patch, HBASE-11735.patch
[jira] [Updated] (HBASE-11682) Explain hotspotting
[ https://issues.apache.org/jira/browse/HBASE-11682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Misty Stanley-Jones updated HBASE-11682: Attachment: HBASE-11682.patch Thanks [~ndimiduk], how's this? Explain hotspotting --- Key: HBASE-11682 URL: https://issues.apache.org/jira/browse/HBASE-11682 Project: HBase Issue Type: Task Components: documentation Reporter: Misty Stanley-Jones Assignee: Misty Stanley-Jones Attachments: HBASE-11682-1.patch, HBASE-11682.patch, HBASE-11682.patch, HBASE-11682.patch, HBASE-11682.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11737) Document callQueue improvements from HBASE-11355 and HBASE-11724
[ https://issues.apache.org/jira/browse/HBASE-11737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14102885#comment-14102885 ] Misty Stanley-Jones commented on HBASE-11737: - What do you think, [~mbertozzi]? Document callQueue improvements from HBASE-11355 and HBASE-11724 Key: HBASE-11737 URL: https://issues.apache.org/jira/browse/HBASE-11737 Project: HBase Issue Type: Sub-task Components: documentation Reporter: Misty Stanley-Jones Assignee: Misty Stanley-Jones Fix For: 0.99.0, 0.98.4 Attachments: HBASE-11737.patch, HBASE-11737.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11752) Document blockcache prefetch option
[ https://issues.apache.org/jira/browse/HBASE-11752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14102887#comment-14102887 ] Misty Stanley-Jones commented on HBASE-11752: - Maybe [~stack] can have a look as well? Document blockcache prefetch option --- Key: HBASE-11752 URL: https://issues.apache.org/jira/browse/HBASE-11752 Project: HBase Issue Type: Sub-task Components: BlockCache, documentation Reporter: Misty Stanley-Jones Assignee: Misty Stanley-Jones Fix For: 0.99.0, 0.98.3 Attachments: HBASE-11752.patch, HBASE-11752.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11773) Wrong field used for protobuf construction in RegionStates.
[ https://issues.apache.org/jira/browse/HBASE-11773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14102906#comment-14102906 ] Hudson commented on HBASE-11773: SUCCESS: Integrated in HBase-0.98-on-Hadoop-1.1 #431 (See [https://builds.apache.org/job/HBase-0.98-on-Hadoop-1.1/431/]) HBASE-11773 Wrong field used for protobuf construction in RegionStates (Andrey Stepachev) (apurtell: rev dbda5c38feb28aef2ee3829264cbe39af54c958d) * hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestRegionState.java * hbase-client/src/main/java/org/apache/hadoop/hbase/master/RegionState.java Wrong field used for protobuf construction in RegionStates. --- Key: HBASE-11773 URL: https://issues.apache.org/jira/browse/HBASE-11773 Project: HBase Issue Type: Bug Components: Region Assignment Reporter: Andrey Stepachev Assignee: Andrey Stepachev Fix For: 0.99.0, 2.0.0, 0.98.6 Attachments: HBASE-11773-0.98.patch, HBASE-11773.patch The protobuf-to-Java POJO converter uses the wrong field when constructing the converted enum (the default value of the protobuf message is used instead). -- This message was sent by Atlassian JIRA (v6.2#6252)
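The one-line description is terse, so here is a hedged, self-contained Java illustration of the general bug class it names; the class, enum, and method names below are invented, and this is not the actual code in RegionState.java. A converter that builds its result from a default value rather than the object's own field silently returns the wrong enum for every non-default state.

```java
// Hypothetical illustration of the bug class described in this issue;
// all names are invented and this is not the real HBase code.
public class RegionStatePojo {
    public enum State { OFFLINE, OPENING, OPEN }

    private final State state;

    public RegionStatePojo(State state) {
        this.state = state;
    }

    // Buggy converter: ignores this.state, so callers always see the
    // default value -- the shape of bug the patch removes.
    public State convertBuggy() {
        return State.OFFLINE;
    }

    // Fixed converter: reads the object's own field.
    public State convert() {
        return this.state;
    }
}
```

The accompanying TestRegionState.java in the commit is presumably what catches this: converting a non-default state and asserting it round-trips.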
[jira] [Updated] (HBASE-11735) Document Configurable Bucket Sizes in bucketCache
[ https://issues.apache.org/jira/browse/HBASE-11735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-11735: -- Resolution: Fixed Fix Version/s: 2.0.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) I removed the bit about hbase.offheapcache.percentage. It is removed and even where it is present, it is a PITA. Nice job [~misty] Thanks. Document Configurable Bucket Sizes in bucketCache - Key: HBASE-11735 URL: https://issues.apache.org/jira/browse/HBASE-11735 Project: HBase Issue Type: Task Components: documentation Affects Versions: 0.99.0, 0.98.4, 0.98.5 Reporter: Misty Stanley-Jones Assignee: Misty Stanley-Jones Fix For: 0.99.0, 2.0.0, 0.98.6 Attachments: HBASE-11735.patch, HBASE-11735.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-4955) Use the official versions of surefire junit
[ https://issues.apache.org/jira/browse/HBASE-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Newman updated HBASE-4955: --- Attachment: HBASE-4955-v10.patch Use the official versions of surefire junit - Key: HBASE-4955 URL: https://issues.apache.org/jira/browse/HBASE-4955 Project: HBase Issue Type: Improvement Components: test Affects Versions: 0.94.0, 0.98.0, 0.96.0, 0.99.0 Environment: all Reporter: Nicolas Liochon Assignee: Nicolas Liochon Priority: Critical Attachments: 4955.v1.patch, 4955.v2.patch, 4955.v2.patch, 4955.v2.patch, 4955.v2.patch, 4955.v3.patch, 4955.v3.patch, 4955.v3.patch, 4955.v4.patch, 4955.v4.patch, 4955.v4.patch, 4955.v4.patch, 4955.v4.patch, 4955.v4.patch, 4955.v5.patch, 4955.v6.patch, 4955.v7.patch, 4955.v7.patch, 4955.v8.patch, 4955.v9.patch, 8204.v4.patch, HBASE-4955-v10.patch We currently use private versions for Surefire JUnit since HBASE-4763. This JIRA tracks what we need to move to official versions. Surefire 2.11 is just out, but, after some tests, it does not contain everything we need. JUnit. Could be for JUnit 4.11. Issue to monitor: https://github.com/KentBeck/junit/issues/359: fixed in our version, no feedback for an integration on trunk Surefire: Could be for Surefire 2.12. Issues to monitor are: 329 (category support): fixed, we use the official implementation from the trunk 786 (@Category with forkMode=always): fixed, we use the official implementation from the trunk 791 (incorrect elapsed time on test failure): fixed, we use the official implementation from the trunk 793 (incorrect time in the XML report): Not fixed (reopened) on trunk, fixed in our version. 
760 (does not take into account the test method): fixed in trunk, not fixed in our version 798 (print immediately the test class name): not fixed in trunk, not fixed in our version 799 (Allow test parallelization when forkMode=always): not fixed in trunk, not fixed in our version 800 (redirectTestOutputToFile not taken into account): not yet fixed on trunk, fixed in our version 800 and 793 are the most important to monitor; they are the only ones that are fixed in our version but not on trunk. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11682) Explain hotspotting
[ https://issues.apache.org/jira/browse/HBASE-11682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14102921#comment-14102921 ] Nick Dimiduk commented on HBASE-11682: -- A little nit-picky, but... (now you know what Aman went through ;) ) {noformat} +<para>Suppose you have the following list of row keys, and your table is split in such a way + that all the rows starting with foo are in the same region.</para> {noformat} I would say ... and your table is split such that there is one region for each letter of the alphabet -- prefix 'a' is one region, prefix 'b' is another. In this table, all rows starting with 'f' are in the same region. That is, be explicitly clear about the region split for the example. {noformat} +an <link xlink:href="http://phoenix.apache.org/salted.html">article on Salted Tables</link> {noformat} an should be and ? Explain hotspotting --- Key: HBASE-11682 URL: https://issues.apache.org/jira/browse/HBASE-11682 Project: HBase Issue Type: Task Components: documentation Reporter: Misty Stanley-Jones Assignee: Misty Stanley-Jones Attachments: HBASE-11682-1.patch, HBASE-11682.patch, HBASE-11682.patch, HBASE-11682.patch, HBASE-11682.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-4955) Use the official versions of surefire junit
[ https://issues.apache.org/jira/browse/HBASE-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14102924#comment-14102924 ] Alex Newman commented on HBASE-4955: OK I disabled that. It seems to work ok on our build server. I'd be interested to see how apache build is working. Use the official versions of surefire junit - Key: HBASE-4955 URL: https://issues.apache.org/jira/browse/HBASE-4955 Project: HBase Issue Type: Improvement Components: test Affects Versions: 0.94.0, 0.98.0, 0.96.0, 0.99.0 Environment: all Reporter: Nicolas Liochon Assignee: Nicolas Liochon Priority: Critical Attachments: 4955.v1.patch, 4955.v2.patch, 4955.v2.patch, 4955.v2.patch, 4955.v2.patch, 4955.v3.patch, 4955.v3.patch, 4955.v3.patch, 4955.v4.patch, 4955.v4.patch, 4955.v4.patch, 4955.v4.patch, 4955.v4.patch, 4955.v4.patch, 4955.v5.patch, 4955.v6.patch, 4955.v7.patch, 4955.v7.patch, 4955.v8.patch, 4955.v9.patch, 8204.v4.patch, HBASE-4955-v10.patch -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Assigned] (HBASE-4955) Use the official versions of surefire junit
[ https://issues.apache.org/jira/browse/HBASE-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Newman reassigned HBASE-4955: -- Assignee: Alex Newman (was: Nicolas Liochon) Use the official versions of surefire junit - Key: HBASE-4955 URL: https://issues.apache.org/jira/browse/HBASE-4955 Project: HBase Issue Type: Improvement Components: test Affects Versions: 0.94.0, 0.98.0, 0.96.0, 0.99.0 Environment: all Reporter: Nicolas Liochon Assignee: Alex Newman Priority: Critical Attachments: 4955.v1.patch, 4955.v2.patch, 4955.v2.patch, 4955.v2.patch, 4955.v2.patch, 4955.v3.patch, 4955.v3.patch, 4955.v3.patch, 4955.v4.patch, 4955.v4.patch, 4955.v4.patch, 4955.v4.patch, 4955.v4.patch, 4955.v4.patch, 4955.v5.patch, 4955.v6.patch, 4955.v7.patch, 4955.v7.patch, 4955.v8.patch, 4955.v9.patch, 8204.v4.patch, HBASE-4955-v10.patch -- This message was sent by Atlassian JIRA (v6.2#6252)