[jira] [Resolved] (HBASE-26479) Print too slow/big scan's operation_id in region server log

2021-11-24 Thread Duo Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-26479.
---
Fix Version/s: 2.5.0
   Resolution: Fixed

Reapplied after fixing the compile issue.

Thanks [~apurtell].

> Print too slow/big scan's operation_id in region server log
> ---
>
> Key: HBASE-26479
> URL: https://issues.apache.org/jira/browse/HBASE-26479
> Project: HBase
>  Issue Type: Improvement
>  Components: regionserver, scan
>Reporter: Xuesen Liang
>Assignee: Xuesen Liang
>Priority: Minor
> Fix For: 2.5.0, 3.0.0-alpha-2
>
> Attachments: HBASE-26479.patch
>
>
> Tracing is very important in large-scale distributed systems.
> The attribute *_operation.attributes.id_* of a scan request can be used as a
> trace id between the HBase client and the region server.
> We should print the operation id in the region server's log if the scan
> request is too slow or too big.
> This will be very helpful for finding problematic requests.

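The patch itself is not included in this message; as a rough illustration only, below is a hypothetical client-side sketch of tagging a Scan with an operation id so that a too-slow or too-big scan can be matched to the region server log line that prints it. The table name and id value are made up.

{code:java}
// Hypothetical client-side sketch: tag a Scan with an operation id so that a
// slow/big scan can be correlated with the region server log line printing it.
// Table name and id value are illustrative only.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;

public class TracedScan {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    try (Connection conn = ConnectionFactory.createConnection(conf);
         Table table = conn.getTable(TableName.valueOf("my_table"))) {
      Scan scan = new Scan();
      // setId() stores the value under the "_operation.attributes.id" attribute.
      scan.setId("trace-4711");
      try (ResultScanner scanner = table.getScanner(scan)) {
        for (Result result : scanner) {
          // process each row
        }
      }
    }
  }
}
{code}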




[jira] [Reopened] (HBASE-26479) Print too slow/big scan's operation_id in region server log

2021-11-24 Thread Andrew Kyle Purtell (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Kyle Purtell reopened HBASE-26479:
-

The commit to branch-2 breaks compilation (branch-2 has HBaseTestingUtility,
not HBaseTestingUtil). Reverted the commit from branch-2.

> Print too slow/big scan's operation_id in region server log
> ---
>
> Key: HBASE-26479
> URL: https://issues.apache.org/jira/browse/HBASE-26479
> Project: HBase
>  Issue Type: Improvement
>  Components: regionserver, scan
>Reporter: Xuesen Liang
>Assignee: Xuesen Liang
>Priority: Minor
> Fix For: 2.5.0, 3.0.0-alpha-2
>
> Attachments: HBASE-26479.patch
>
>
> Tracing is very important in large-scale distributed systems.
> The attribute *_operation.attributes.id_* of a scan request can be used as a
> trace id between the HBase client and the region server.
> We should print the operation id in the region server's log if the scan
> request is too slow or too big.
> This will be very helpful for finding problematic requests.





[jira] [Created] (HBASE-26485) Introduce a method to clean restore directory after Snapshot Scan

2021-11-24 Thread ruanhui (Jira)
ruanhui created HBASE-26485:
---

 Summary: Introduce a method to clean restore directory after 
Snapshot Scan
 Key: HBASE-26485
 URL: https://issues.apache.org/jira/browse/HBASE-26485
 Project: HBase
  Issue Type: Improvement
  Components: snapshots
Reporter: ruanhui
Assignee: ruanhui


Snapshot scans are widely used in our company. However, after a snapshot scan 
job finishes, the restore directory is not cleaned up, which may put a lot of 
pressure on HDFS over time. So we could introduce a method for users to clean 
up the snapshot restore directory after the job.
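No patch is attached here yet; as a rough illustration of the kind of cleanup the issue asks for, the sketch below simply deletes the restore directory on HDFS after the snapshot scan job has finished. The path and helper name are hypothetical and not the API this issue introduces.

{code:java}
// Hedged sketch: manually delete the snapshot restore directory after the job.
// The restore path is illustrative; a real job would pass the same directory
// it handed to TableSnapshotInputFormat.setInput().
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public final class RestoreDirCleanup {
  public static void cleanRestoreDir(Configuration conf, String restoreDir) throws Exception {
    Path restorePath = new Path(restoreDir); // e.g. "/tmp/snapshot-scan-restore-42"
    FileSystem fs = restorePath.getFileSystem(conf);
    if (fs.exists(restorePath)) {
      // Recursively remove the temporary region directories and HFile links
      // created when the snapshot was restored for scanning.
      fs.delete(restorePath, true);
    }
  }
}
{code}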





[jira] [Resolved] (HBASE-26482) HMaster may clean wals that is replicating in rare cases

2021-11-24 Thread Duo Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duo Zhang resolved HBASE-26482.
---
Hadoop Flags: Reviewed
  Resolution: Fixed

Pushed to branch-2.4+.

Thanks [~zhengzhuobinzzb] for contributing.

> HMaster may clean wals that is replicating in rare cases
> 
>
> Key: HBASE-26482
> URL: https://issues.apache.org/jira/browse/HBASE-26482
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Reporter: zhuobin zheng
>Assignee: zhuobin zheng
>Priority: Critical
> Fix For: 2.5.0, 3.0.0-alpha-2, 2.4.9
>
>
> In our cluster, I found some FileNotFoundExceptions when the
> ReplicationSourceWALReader was running for a replication recovery queue.
> I guess the WALs were most likely removed by HMaster, and I found something
> to support that.
> The method getAllWALs
> (https://github.com/apache/hbase/blob/master/hbase-replication/src/main/java/org/apache/hadoop/hbase/replication/ZKReplicationQueueStorage.java#L509)
> uses the zk cversion of /hbase/replication/rs as an optimistic lock to
> control concurrent ops.
> But the zk cversion *can only reflect changes to the direct child nodes, not
> changes to the grandchildren* (see the sketch after the list below).
> So HMaster may lose some WALs from this method in the following situation:
>  # HMaster runs the log cleaner and invokes getAllWALs to filter out logs
> which should not be deleted.
>  # HMaster caches the current cversion of /hbase/replication/rs as *v0*.
>  # HMaster caches all RS server names, traverses them, and gets the WALs in
> each queue.
>  # *RS2* dies after HMaster has traversed *RS1* and before it traverses
> *RS2*.
>  # *RS1* claims one queue of *RS2*, which is now named *peerid-RS2*.
>  # Note that the cversion of /hbase/replication/rs does not change until all
> of *RS2*'s queues are removed, because the direct children of
> /hbase/replication/rs have not changed.
>  # So HMaster will lose the WALs in *peerid-RS2*, because we have already
> traversed *RS1* and this queue no longer exists under *RS2*.
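As an aside (not part of the original report), here is a minimal sketch of the cversion behaviour described in the list above, using a plain ZooKeeper client. The znode paths and server names are made up, and the parent/child znodes are assumed to already exist (and to be empty where deleted).

{code:java}
// Illustration only: creating or deleting a grandchild leaves the cversion of
// /hbase/replication/rs untouched, while changing a direct child bumps it.
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.data.Stat;

public class CversionDemo {
  public static void main(String[] args) throws Exception {
    ZooKeeper zk = new ZooKeeper("localhost:2181", 30000, event -> { });
    String rsRoot = "/hbase/replication/rs";

    Stat stat = zk.exists(rsRoot, false);
    System.out.println("cversion before: " + stat.getCversion());

    // Adding a queue under an existing RS znode is a *grandchild* change:
    // the cversion of /hbase/replication/rs does not move.
    zk.create(rsRoot + "/rs1,16020,1/peerid-RS2", new byte[0],
        ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
    System.out.println("after grandchild create: " + zk.exists(rsRoot, false).getCversion());

    // Removing the dead RS znode itself is a *direct child* change: it bumps.
    zk.delete(rsRoot + "/rs2,16020,1", -1);
    System.out.println("after child delete: " + zk.exists(rsRoot, false).getCversion());

    zk.close();
  }
}
{code}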
> The above scenario is currently only speculation, not confirmed.
> The FileNotFound log:
>  
> {code:java}
> 2021-11-22 15:18:39,593 ERROR 
> [ReplicationExecutor-0.replicationSource,peer_id-hostname,60020,1636802867348.replicationSource.wal-reader.hostname%2C60020%2C1636802867348,peer_id-hostname,60020,1636802867348]
>  regionserver.WALEntryStream: Couldn't locate log: 
> hdfs://namenode/hbase/oldWALs/hostname%2C60020%2C1636802867348.1636944748704
> 2021-11-22 15:18:39,593 ERROR 
> [ReplicationExecutor-0.replicationSource,peer_id-hostname,60020,1636802867348.replicationSource.wal-reader.hostname%2C60020%2C1636802867348,peer_id-hostname,60020,1636802867348]
>  regionserver.ReplicationSourceWALReader: Failed to read stream of 
> replication entries
> java.io.FileNotFoundException: File does not exist: 
> hdfs://namenode/hbase/oldWALs/hostname%2C60020%2C1636802867348.1636944748704
>         at 
> org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1612)
>         at 
> org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1605)
>         at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>         at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1620)
>         at 
> org.apache.hadoop.hbase.regionserver.wal.ReaderBase.init(ReaderBase.java:64)
>         at 
> org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.init(ProtobufLogReader.java:168)
>         at 
> org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:321)
>         at 
> org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:303)
>         at 
> org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:291)
>         at 
> org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:427)
>         at 
> org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.openReader(WALEntryStream.java:355)
>         at 
> org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.openNextLog(WALEntryStream.java:303)
>         at 
> org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.checkReader(WALEntryStream.java:294)
>         at 
> org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.tryAdvanceEntry(WALEntryStream.java:175)
>         at 
> org.apache.hadoop.hbase.replication.regionserver.WALEntryStream.hasNext(WALEntryStream.java:101)
>         at 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceWALReader.readWALEntries(ReplicationSourceWALReader.java:192)
>         at 
> org.apache.hadoop.hbase.replication.reg

Re: RegionSize returning in MB - change to bytes?

2021-11-24 Thread Peter Somogyi
Norbert made the change that Duo recommended:
https://github.com/apache/hbase/pull/3872
I'll wait a few days in case someone wants to review it as well.
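For reference, here is a hedged sketch of the rounding idea Duo suggests in the quoted thread below; it is not necessarily what PR 3872 implements. The idea is to round up on the reporting side so a region holding any data is never reported as 0 MB.

{code:java}
// Hedged sketch of the "round up to at least 1 MB" idea discussed below.
public final class RegionSizeRounding {
  private static final long BYTES_PER_MB = 1024L * 1024L;

  /** Convert an exact byte count to the MB value carried in RegionLoad. */
  static int toReportedMB(long storefileSizeBytes) {
    if (storefileSizeBytes <= 0) {
      return 0; // genuinely empty region
    }
    // Round up: anything between 1 byte and 1 MB becomes 1 MB, so callers
    // like Spark's ignoreEmptySplits do not mistake small regions for empty.
    return (int) ((storefileSizeBytes + BYTES_PER_MB - 1) / BYTES_PER_MB);
  }
}
{code}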

On Sun, Oct 17, 2021 at 6:21 PM 张铎(Duo Zhang)  wrote:

> I think the problem is here:
>
>
> https://github.com/apache/hbase/blob/736f3e77c8d13719fc48e04876368d3494699280/hbase-protocol-shaded/src/main/protobuf/server/ClusterStatus.proto#L98
>
> We only store the storefile_size_MB in RegionLoad, so even if you use
> get(Unit.BYTE), you cannot get the actual size in bytes...
>
> Maybe we could always round the size to 1MB if the region size is greater
> than 0 but less than 1MB?
>
> Nick Dimiduk wrote on Thu, Oct 14, 2021 at 1:51 AM:
>
> > Maybe the master is only tracking sizes in MB because that's what is sent
> > from the region servers via heartbeat protocol? Reducing the load over the
> > wire makes sense as a scalability concern.
> >
> > On Wed, Oct 13, 2021 at 03:45 Norbert Kalmar
> > wrote:
> >
> > > Thank you for the inputs.
> > > Yes, we do return bytes, but the value is converted from MB, so we get a
> > > false size of 0 below 1 MB.
> > >
> > > I checked HBASE-16169; even before that patch we used MB in
> > > RegionSizeCalculator, the patch just kept the idea of using MB as a
> > > unit.
> > >
> > > I will map every usage related to region size and try to figure out the
> > > reason why MB is the standardized unit. And after all, we can always
> > > convert to MB - as we do convert to bytes right now. But this way we
> > > wouldn't lose the precise size due to conversion if we need the size in
> > > bytes.
> > >
> > > I'll update my PR once I think I figured out the solution.
> > >
> > > - Norbert
> > >
> > >
> > >
> > > On Wed, Oct 13, 2021 at 4:59 AM 张铎(Duo Zhang) 
> > > wrote:
> > >
> > > > The return value is in bytes; the problem is that we normalize the
> > > > size to MB and then multiply back to get the size in bytes, so if a
> > > > file is less than 1 MB, the returned value will be zero.
> > > >
> > > > Need to investigate more here.
> > > >
> > > > Reading the issue, the scalability problem they wanted to solve is
> > > > that we go to the master to get the region size; it is not about
> > > > whether the unit is MB or not.
> > > >
> > > > Thanks.
> > > >
> > > > Nick Dimiduk wrote on Wed, Oct 13, 2021 at 7:47 AM:
> > > >
> > > > > Hi Norbert,
> > > > >
> > > > > To answer your question directly: the RegionSizeCalculator class is
> > > > > annotated with @InterfaceAudience.Private, which means there's a good
> > > > > chance that its implementation can be changed without the need for a
> > > > > deprecation cycle and user participation.
> > > > >
> > > > > Curiously, I noticed that this `sizeMap` is accessed down in the
> > > > > method `long getRegionSize(byte[])`, and its javadoc mentions the
> > > > > returned unit explicitly as bytes.
> > > > >
> > > > > So with a little investigation using git blame, I see that the
> > > > > switch from returning values in bytes to values in megabytes came in
> > > > > through HBASE-16169 -- your proposed change was the old
> > > > > implementation. For whatever reasons, it was determined to not be
> > > > > scalable. So, we could revert back, but we'd need some new solution
> > > > > to what HBASE-16169 aimed to solve.
> > > > >
> > > > > I hope this helps.
> > > > >
> > > > > Thanks,
> > > > > Nick
> > > > >
> > > > > On Tue, Oct 12, 2021 at 10:54 AM Norbert Kalmar 
> > > > > wrote:
> > > > >
> > > > > > Hi All,
> > > > > >
> > > > > > There is a new optimization in Spark (SPARK-34809) where
> > > > > > ignoreEmptySplits filters out all regions whose size is 0. They use
> > > > > > a Hadoop library getSize() in TableInputFormat.
> > > > > >
> > > > > > Drilling down, this will return bytes, but it converts from
> > > > > > megabytes - meaning anything under 1 MB will come back as 0 bytes,
> > > > > > i.e. empty.
> > > > > > I did a quick PR I thought would help:
> > > > > > https://github.com/apache/hbase/pull/3737
> > > > > > But it turns out it's not as easy as requesting the size in bytes
> > > > > > instead of MB from the Size class, as we set it in MB to begin with
> > > > > > in RegionMetricsBuilder
> > > > > > -> setStoreFileSize(new Size(regionLoadPB.getStorefileSizeMB(),
> > > > > > Size.Unit.MEGABYTE))
> > > > > >
> > > > > > I did some testing: after inserting a few kilobytes of data,
> > > > > > calling list_regions will in fact give back size 0.
> > > > > >
> > > > > > My question is, is it okay to store the region size in bytes
> > > > > > instead? Mainly asking because of backward compatibility reasons.
> > > > > >
> > > > > > Regards,
> > > > > > Norbert
> > > > > >
> > > > >
> > > >
> > >
> >
>
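As a closing illustration of the truncation Norbert describes above, here is a small, self-contained sketch using the public Size class; the 0 MB value stands in for regionLoadPB.getStorefileSizeMB() as received by the master for a region holding less than a megabyte of data.

{code:java}
// Sketch of the precision loss under discussion: once the store file size has
// been truncated to whole megabytes, converting back to bytes cannot recover
// the real (sub-megabyte) size.
import org.apache.hadoop.hbase.Size;

public class RegionSizeTruncationDemo {
  public static void main(String[] args) {
    // A region with a few hundred KB of store files arrives at the master as 0 MB.
    Size reported = new Size(0, Size.Unit.MEGABYTE);

    // Asking for bytes therefore yields 0, which Spark's ignoreEmptySplits
    // interprets as an empty region.
    System.out.println(reported.get(Size.Unit.BYTE)); // prints 0.0
  }
}
{code}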