[jira] [Created] (HBASE-28400) WAL readers treat any exception as EOFException, which can lead to data loss

2024-02-23 Thread Bryan Beaudreault (Jira)
Bryan Beaudreault created HBASE-28400:
-

 Summary: WAL readers treat any exception as EOFException, which 
can lead to data loss
 Key: HBASE-28400
 URL: https://issues.apache.org/jira/browse/HBASE-28400
 Project: HBase
  Issue Type: Bug
Reporter: Bryan Beaudreault


In HBASE-28390, I found a bug in our WAL compression which manifests as an 
IllegalArgumentException or ArrayIndexOutOfBoundsException. Even worse, 
ProtobufLogReader.readNext catches any Exception and rethrows it as an 
EOFException. EOFException is handled in a variety of ways by the readers of 
WALs, and not all of that handling makes sense for an exception that isn't 
really EOF.

For example, WALInputFormat catches EOFException and returns false for 
nextKeyValue(), effectively skipping the rest of the WAL file but not failing 
the job.

ReplicationSourceWALReader has much more complicated handling of 
EOFException.
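
To make the failure mode concrete, here is a minimal sketch. It is 
illustrative only, not the actual ProtobufLogReader code; the class and 
method names are invented for the example. Wrapping arbitrary exceptions as 
EOFException makes a corrupt record indistinguishable from a clean 
end-of-file, so callers that treat EOF as "done" silently drop the rest of 
the WAL.

```java
import java.io.EOFException;
import java.io.IOException;

class WalReaderSketch {
    // Hypothetical decode step; corruption surfaces here as an
    // ArrayIndexOutOfBoundsException.
    private int decodeLengthAt(byte[] buf, int offset) {
        return buf[offset];
    }

    // Mirrors the problematic pattern: every failure, including real
    // corruption, is rethrown as EOFException.
    int readNext(byte[] buf, int offset) throws IOException {
        try {
            return decodeLengthAt(buf, offset);
        } catch (Exception e) {
            EOFException eof = new EOFException("EOF while reading WAL entry");
            eof.initCause(e);
            throw eof;
        }
    }
}
```

A caller like WALInputFormat that catches EOFException and returns false 
from nextKeyValue() sees this corruption as a normal end of file and skips 
the remainder of the WAL without failing the job.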



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (HBASE-28390) WAL value compression fails for cells with large values

2024-02-23 Thread Bryan Beaudreault (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-28390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bryan Beaudreault resolved HBASE-28390.
---
Fix Version/s: 2.6.0
   2.5.8
   3.0.0-beta-2
 Assignee: Bryan Beaudreault
   Resolution: Fixed

Pushed to branch-2.5+. Thanks [~apurtell] for the review

> WAL value compression fails for cells with large values
> ---
>
> Key: HBASE-28390
> URL: https://issues.apache.org/jira/browse/HBASE-28390
> Project: HBase
>  Issue Type: Bug
>Reporter: Bryan Beaudreault
>Assignee: Bryan Beaudreault
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.6.0, 2.5.8, 3.0.0-beta-2
>
>
> We are testing out WAL compression and noticed that it fails for large values 
> when both features (wal compression and wal value compression) are enabled. 
> It works fine with either feature independently, but not when combined. It 
> seems to fail for all of the value compressor types, and the failure is in 
> the LRUDictionary of wal key compression:
>  
> {code:java}
> java.io.IOException: Error while reading 2 WAL KVs; started reading at 230 and read up to 396
>     at org.apache.hadoop.hbase.regionserver.wal.ProtobufWALStreamReader.next(ProtobufWALStreamReader.java:94) ~[classes/:?]
>     at org.apache.hadoop.hbase.wal.CompressedWALTestBase.doTest(CompressedWALTestBase.java:181) ~[test-classes/:?]
>     at org.apache.hadoop.hbase.wal.CompressedWALTestBase.testForSize(CompressedWALTestBase.java:129) ~[test-classes/:?]
>     at org.apache.hadoop.hbase.wal.CompressedWALTestBase.testLarge(CompressedWALTestBase.java:94) ~[test-classes/:?]
>     at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:?]
>     at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:?]
>     at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:?]
>     at java.lang.reflect.Method.invoke(Method.java:566) ~[?:?]
>     at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) ~[junit-4.13.2.jar:4.13.2]
>     at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) ~[junit-4.13.2.jar:4.13.2]
>     at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) ~[junit-4.13.2.jar:4.13.2]
>     at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) ~[junit-4.13.2.jar:4.13.2]
>     at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) ~[junit-4.13.2.jar:4.13.2]
>     at org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100) ~[junit-4.13.2.jar:4.13.2]
>     at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366) ~[junit-4.13.2.jar:4.13.2]
>     at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103) ~[junit-4.13.2.jar:4.13.2]
>     at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63) ~[junit-4.13.2.jar:4.13.2]
>     at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331) ~[junit-4.13.2.jar:4.13.2]
>     at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79) ~[junit-4.13.2.jar:4.13.2]
>     at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329) ~[junit-4.13.2.jar:4.13.2]
>     at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66) ~[junit-4.13.2.jar:4.13.2]
>     at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293) ~[junit-4.13.2.jar:4.13.2]
>     at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) ~[junit-4.13.2.jar:4.13.2]
>     at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) ~[junit-4.13.2.jar:4.13.2]
>     at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299) ~[junit-4.13.2.jar:4.13.2]
>     at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293) ~[junit-4.13.2.jar:4.13.2]
>     at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
>     at java.lang.Thread.run(Thread.java:829) ~[?:?]
> Caused by: java.lang.IndexOutOfBoundsException: index (21) must be less than size (1)
>     at org.apache.hbase.thirdparty.com.google.common.base.Preconditions.checkElementIndex(Preconditions.java:1371) ~[hbase-shaded-miscellaneous-4.1.5.jar:4.1.5]
>     at org.apache.hbase.thirdparty.com.google.common.base.Preconditions.checkElementIndex(Preconditions.java:1353) ~[hbase-shaded-miscellaneous-4.1.5.jar:4.1.5]
>     at org.apache.hadoop.hbase.io.util.LRUDictiona

Re: [DISCUSS] Enabling JDK17 build pipelines

2024-02-23 Thread Nick Dimiduk
+1

On Fri, Feb 23, 2024 at 4:34 AM 张铎(Duo Zhang)  wrote:

> Agree on Andrew's opinion.
>
> We could enable it for master and branch-3 first.
> I will help with stabilizing.
>
> Thanks.
>
> > Andrew Purtell wrote on Fri, Feb 23, 2024 at 02:14:
> >
> > > Would like to gather opinion on which branches should we
> > > prioritize/enable JDK 17 pipelines
> >
> > master
> > branch-3
> >
> > When the JDK 17 pipelines have stabilized for those, then consider
> > branch-2 and the release branches. Probably by then most issues would
> > have a previously developed solution that can be easily adapted or
> > cherry-picked.
> >
> > On Thu, Feb 22, 2024 at 9:40 AM rajeshb...@apache.org <
> > chrajeshbab...@gmail.com> wrote:
> >
> > > Hi Team,
> > >
> > > As all the active branches have support for JDK17 it would be better to
> > > align build pipelines, pre-commit jobs and nightlies accordingly.
> > >
> > > I have raised PR to support the same
> > > https://github.com/apache/hbase/pull/5689.
> > >
> > > Would like to gather opinion on which branches should we
> > > prioritize/enable JDK 17 pipelines.
> > >
> > > Feel free to share your thoughts.
> > >
> > > Thanks,
> > > Rajeshbabu.
> > >
> >
> >
> > --
> > Best regards,
> > Andrew
> >
> > Unrest, ignorance distilled, nihilistic imbeciles -
> > It's what we’ve earned
> > Welcome, apocalypse, what’s taken you so long?
> > Bring us the fitting end that we’ve been counting on
> >- A23, Welcome, Apocalypse
>


[jira] [Created] (HBASE-28399) region size can be wrong from RegionSizeCalculator

2024-02-23 Thread ruanhui (Jira)
ruanhui created HBASE-28399:
---

 Summary: region size can be wrong from RegionSizeCalculator
 Key: HBASE-28399
 URL: https://issues.apache.org/jira/browse/HBASE-28399
 Project: HBase
  Issue Type: Bug
  Components: mapreduce
Affects Versions: 3.0.0-beta-1
Reporter: ruanhui
Assignee: ruanhui
 Fix For: 3.0.0-beta-2


The RegionSizeCalculator calculates region byte size using the following method
{code:java}
private static final long MEGABYTE = 1024L * 1024L;
long regionSizeBytes =
  ((long) regionLoad.getStoreFileSize().get(Size.Unit.MEGABYTE)) * MEGABYTE; 
{code}
However, this conversion loses precision. For example, the result of 
{code:java}
((long) new Size(1, Size.Unit.BYTE).get(Size.Unit.MEGABYTE)) * MEGABYTE {code}
is 0. This produces a TableInputSplit with a length of 0, even though the 
split actually contains a small amount of data.

 

This TableInputSplit will be ignored if we enable 
spark.hadoopRDD.ignoreEmptySplits.
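
The truncation described above can be reproduced with plain long/double 
arithmetic, independent of the HBase Size API. A minimal sketch; the 
round-up shown is one possible fix, an assumption for illustration, not 
necessarily the committed patch:

```java
class RegionSizePrecision {
    private static final long MEGABYTE = 1024L * 1024L;

    // Mirrors the lossy round trip in RegionSizeCalculator: bytes -> MB
    // (as a double) -> cast to long -> back to bytes. Anything under one
    // megabyte truncates to zero.
    static long lossyRoundTrip(long sizeBytes) {
        double sizeMb = (double) sizeBytes / MEGABYTE;
        return ((long) sizeMb) * MEGABYTE;
    }

    // One possible fix: round up, so a non-empty region never reports a
    // zero-length split.
    static long roundUp(long sizeBytes) {
        double sizeMb = (double) sizeBytes / MEGABYTE;
        return (long) Math.ceil(sizeMb) * MEGABYTE;
    }

    public static void main(String[] args) {
        System.out.println(lossyRoundTrip(1)); // 0
        System.out.println(roundUp(1));        // 1048576
    }
}
```

With spark.hadoopRDD.ignoreEmptySplits enabled, the zero-length split 
produced by the lossy conversion is dropped outright, which is how the 
region's data goes unread.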
