I created https://github.com/apache/accumulo/pull/3134 and that should fix this issue.
Thanks Vincent for pointing out the issue in TServerUtils; your testing made this easy to track down and fix.

On Sat, Dec 17, 2022 at 9:00 AM Christopher Shannon <christopher.l.shan...@gmail.com> wrote:

I was able to reproduce the issue by setting the value size for a mutation to 16384001, to make sure it's greater than the default value for Thrift, and it fails immediately. I will work on a fix now that we know how to reproduce it.

On Fri, Dec 16, 2022 at 2:31 PM Christopher <ctubb...@apache.org> wrote:

I don't think it's intentional. This might be the source of the problem.

On Thu, Dec 15, 2022 at 3:39 PM Vincent Russell <vincent.russ...@gmail.com> wrote:

Also, in TServerUtils:270, when the TNonblockingServerSocket is created, it looks like it ends up using the default frame size. I am not sure if this is intentional or not.

On Thu, Dec 15, 2022 at 3:26 PM Vincent Russell <vincent.russ...@gmail.com> wrote:

Christopher,

I am not sure if this issue is related to 3042 or not.

On the client side it does look like TConfiguration ends up being called with the default constructor. I am not sure if this is intentional or not.

On the server side I see this stack, so it also looks like the default is being used there:

    at org.apache.thrift.TConfiguration.<init>(TConfiguration.java:36)
    at org.apache.thrift.TConfiguration$Builder.build(TConfiguration.java:99)
    at org.apache.thrift.TConfiguration.<clinit>(TConfiguration.java:65)
    at org.apache.thrift.transport.TNonblockingSocket.<init>(TNonblockingSocket.java:74)
    at org.apache.thrift.transport.TNonblockingSocket.<init>(TNonblockingSocket.java:68)
    at org.apache.thrift.transport.TNonblockingServerSocket.accept(TNonblockingServerSocket.java:135)
    at org.apache.thrift.transport.TNonblockingServerSocket.accept(TNonblockingServerSocket.java:36)
    at org.apache.thrift.server.TNonblockingServer$SelectAcceptThread.handleAccept(TNonblockingServer.java:218)
    at org.apache.thrift.server.TNonblockingServer$SelectAcceptThread.select(TNonblockingServer.java:186)
    at org.apache.thrift.server.TNonblockingServer$SelectAcceptThread.run(TNonblockingServer.java:142)

I see this in the server log, so it does look like it should be using 1G:

    2022-09-01 16:59:41 INFO [org.apache.accumulo.tserver.TabletServer] ServerUtil:124 - general.server.message.size.max = 1G

Thanks,
Vincent

On Thu, Dec 15, 2022 at 12:26 PM Vincent Russell <vincent.russ...@gmail.com> wrote:

I had to get a stack trace by hacking together a remote debug instance:

    at org.apache.thrift.server.AbstractNonblockingServer$FrameBuffer.read(AbstractNonblockingServer.java:334)
    at org.apache.accumulo.server.rpc.CustomNonBlockingServer$CustomFrameBuffer.read(CustomNonBlockingServer.java:134)
    at org.apache.thrift.server.AbstractNonblockingServer$AbstractSelectThread.handleRead(AbstractNonblockingServer.java:187)
    at org.apache.thrift.server.TNonblockingServer$SelectAcceptThread.select(TNonblockingServer.java:189)
    at org.apache.thrift.server.TNonblockingServer$SelectAcceptThread.run(TNonblockingServer.java:142)
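As an aside on where that 16384000 number comes from, here is a minimal standalone sketch against the libthrift TConfiguration class seen in the stack traces above. The three-argument constructor order and the getter names are assumptions based on Thrift 0.14+, and this is only an illustration, not the Accumulo fix:

    import org.apache.thrift.TConfiguration;

    public class ThriftFrameDefaults {
        public static void main(String[] args) {
            // The no-arg constructor falls back to Thrift's built-in limits, which is
            // where the 16384000 (~16 MB) frame size in the error comes from.
            TConfiguration defaults = new TConfiguration();
            System.out.println(defaults.getMaxFrameSize());   // 16384000
            System.out.println(defaults.getMaxMessageSize()); // 100 MB default

            // An explicitly constructed TConfiguration is how a larger, server-configured
            // limit (e.g. 1 GB to match general.server.message.size.max = 1G) would be
            // carried through instead of the built-in default.
            TConfiguration larger = new TConfiguration(
                    1024 * 1024 * 1024, // maxMessageSize
                    1024 * 1024 * 1024, // maxFrameSize
                    64);                // recursion limit (Thrift's default)
            System.out.println(larger.getMaxFrameSize());     // 1073741824
        }
    }

The question in this thread is which of the internal `new TConfiguration()` call sites ends up applying instead of the server's configured 1G limit.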
On Thu, Dec 15, 2022 at 12:52 AM Christopher <ctubb...@apache.org> wrote:

From the numbers in the message, it looks like you're sending an 18 MB payload, but something in Thrift is limiting things to 16384000 (16,000 KB). I doubt you've overridden the default general.server.message.size.max to be anything that low (the default is 1G). Unless you're flushing after every mutation, it would not be surprising to exceed the 16 MB max frame size indicated in the error message quite quickly.

This value of 16384000 seemed weird. It looks like it's not using our configuration, but using the built-in default value of org.apache.thrift.TConfiguration.DEFAULT_MAX_FRAME_SIZE. It looks like this can happen whenever `new TConfiguration()` is called without parameters... and there's a fair amount of internal code, mostly in libthrift itself, that does that. It's a bit tricky to track down the one causing this particular issue. If you have a full stack trace, it could help.

Also, this might be the same issue seen reported in https://github.com/apache/accumulo/issues/3042
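For anyone wanting to reproduce this outside the original integration test, a hypothetical client-side sketch along the lines Christopher Shannon describes at the top of the thread. The instance name, credentials, and table name are placeholders; in the original report the client was talking to a MiniAccumuloCluster started by the test:

    import org.apache.accumulo.core.client.Accumulo;
    import org.apache.accumulo.core.client.AccumuloClient;
    import org.apache.accumulo.core.client.BatchWriter;
    import org.apache.accumulo.core.data.Mutation;
    import org.apache.accumulo.core.data.Value;

    public class BigValueRepro {
        public static void main(String[] args) throws Exception {
            try (AccumuloClient client = Accumulo.newClient()
                    .to("test-instance", "localhost:2181") // placeholder connection info
                    .as("root", "secret")
                    .build()) {
                client.tableOperations().create("bigvalues");

                // One value just over the 16384000-byte default frame size; the
                // 5 MB mutations in the original test exceeded it once batched.
                byte[] big = new byte[16_384_001];
                Mutation m = new Mutation("row1");
                m.put("cf", "cq", new Value(big));

                try (BatchWriter bw = client.createBatchWriter("bigvalues")) {
                    bw.addMutation(m);
                    bw.flush(); // hangs/fails with the frame-size error on affected versions
                }
            }
        }
    }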
On Wed, Dec 14, 2022 at 8:53 PM Vincent Russell <vincent.russ...@gmail.com> wrote:

I was able to work out all of my compilation issues; however, when I run an integration test with the MiniAccumuloCluster that writes mutations with 5 MB values, the flush hangs forever and I see the following logs in the TabletServer logs:

    20:41:02.306 [Thread-7] ERROR o.a.a.s.r.CustomNonBlockingServer$CustomFrameBuffer - Read a frame size of 18874697, which is bigger than the maximum allowable frame size 16384000 for ALL connections.
    20:41:03.582 [Thread-7] ERROR o.a.a.s.r.CustomNonBlockingServer$CustomFrameBuffer - Read a frame size of 18874697, which is bigger than the maximum allowable frame size 16384000 for ALL connections.
    20:41:05.079 [Thread-7] ERROR o.a.a.s.r.CustomNonBlockingServer$CustomFrameBuffer - Read a frame size of 18874697, which is bigger than the maximum allowable frame size 16384000 for ALL connections.

Other tests that write smaller amounts of data appear to work fine.

Any idea what the issue might be?

Thank you,
Vincent

On Tue, Dec 13, 2022 at 4:43 PM Vincent Russell <vincent.russ...@gmail.com> wrote:

Thank you both for your responses.

We are using an event store library from a sister project that was written for Accumulo 1.10, which I have already upgraded to 2.0.

I'll spend some time investigating how bad the usage of the internal packages is and get back to you.

Thanks again,

On Tue, Dec 13, 2022 at 3:20 PM Christopher <ctubb...@apache.org> wrote:

To add to Dave's answer, the public API is defined at https://accumulo.apache.org/api/
Anything else is not public and is subject to change without notice on any release, without any attempt to retain compatibility.

On Tue, Dec 13, 2022 at 3:10 PM Dave Marion <dmario...@gmail.com> wrote:

There is no guide. You are using implementation classes (see clientImpl in the package name) vs. using the client API. If you can use the client API directly, then this should insulate you from changes in the future (except during major versions). We can try to find where things might have moved, but a class may have been split into multiple pieces. If you could provide the class and method, that would be easier.

On Tue, Dec 13, 2022 at 2:45 PM Vincent Russell <vincent.russ...@gmail.com> wrote:

Is there a guide that shows where classes may have moved between 2.0 and 2.1? For instance, I am having issues compiling, because the following class doesn't exist:

    import org.apache.accumulo.core.clientImpl.Tables;

I'm just getting started, so I'm sure there are others.

Thanks,
Vincent
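As one concrete example of moving off that internal class: the table name to table ID lookup that clientImpl.Tables was commonly used for is available through the public API via TableOperations.tableIdMap(). A small sketch, with placeholder connection details:

    import java.util.Map;

    import org.apache.accumulo.core.client.Accumulo;
    import org.apache.accumulo.core.client.AccumuloClient;

    public class TableIdLookup {
        public static void main(String[] args) throws Exception {
            try (AccumuloClient client = Accumulo.newClient()
                    .to("test-instance", "localhost:2181") // placeholder connection info
                    .as("root", "secret")
                    .build()) {
                // Public-API replacement for a table ID lookup, instead of the
                // internal org.apache.accumulo.core.clientImpl.Tables helper.
                Map<String, String> tableIds = client.tableOperations().tableIdMap();
                System.out.println("mytable -> " + tableIds.get("mytable"));
            }
        }
    }

Since tableIdMap() is part of the public API, it falls under the compatibility policy linked above and should not move between releases.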
On Fri, Dec 9, 2022 at 9:02 AM Vincent Russell <vincent.russ...@gmail.com> wrote:

I mean Christopher.

Thanks again.

On Fri, Dec 9, 2022 at 9:01 AM Vincent Russell <vincent.russ...@gmail.com> wrote:

Thank you Chris.

We will upgrade to Accumulo 2.1 and ZooKeeper 3.7 or later as soon as possible.

On Thu, Dec 8, 2022 at 8:44 PM Christopher <ctubb...@apache.org> wrote:

Hi Vincent,

Version 2.0.1 is end of life as of the 2.1.0 LTM release, and 2.0 is not expected to receive any further updates. Version 2.1.0 may work with ZooKeeper 3.4, but was developed and tested against 3.5 and later versions. I believe the ZooKeeper community is currently considering whether to make 3.6 end-of-life themselves, so I would recommend using Accumulo 2.1.0 with the latest ZooKeeper 3.7 or later to have the best chance of any kind of support, including JDK 17 support.

As for your specific issues:

1. This is already fixed in 2.1.0.
2/3. These issues are likely fixed in newer ZooKeeper versions. I haven't seen them anytime recently, anyway. Bugs in ZooKeeper itself are out of scope for the Accumulo developers, but I have tried building Accumulo 2.1.0 with JDK 17 and ZooKeeper 3.8.0 and haven't observed any unresolved issues. However, it's difficult to actually run it because I don't think Hadoop has good JDK 17 support yet. So, MiniAccumuloCluster seems to work with JDK 17, as do Accumulo and ZooKeeper 3.8, but I don't think a full Hadoop cluster would (yet).

On Thu, Dec 8, 2022 at 12:28 PM Vincent Russell <vincent.russ...@gmail.com> wrote:

Hello,

We are currently using Accumulo 2.0.1.

We are in the process of upgrading our source code to use JDK 17; however, we are running into some problems with our tests and the MiniAccumuloCluster.

One of our developers encountered the following issues:

1. MiniAccumuloClusterImpl._exec is hardcoded with the JVM arg -XX:+UseConcMarkSweepGC, which is no longer tolerated with JDK 17.
2. In ZooKeeper 3.4.14, ConnectStringParser uses createUnresolved to create the server addresses. SaslServerPrincipal.WrapperInetSocketAddress.getAddress uses InetSocketAddress.getAddress, which returns null because the address is not resolved, resulting in a failure to connect to the newly-started ZooKeeper.
3. StaticHostProvider.getHostString() tries to extract the hostname by calling toString on the address and taking everything before the colon, but in JDK 17 the string format changed to "localhost/<unresolved>:XX" (where XX is still the port number). That's incorrect, and it can't resolve the names.

Has anyone come across/resolved these kinds of issues? Is it not possible to use Java 17 from a client perspective? Will upgrading to Accumulo 2.1 help?
Thanks,
Vincent
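For reference on issues 2 and 3, a tiny JDK-only snippet (no Accumulo or ZooKeeper involved) showing the two behaviors of an unresolved InetSocketAddress that ZooKeeper 3.4's connect-string handling runs into on newer JDKs:

    import java.net.InetSocketAddress;

    public class UnresolvedAddressDemo {
        public static void main(String[] args) {
            // ZooKeeper 3.4's ConnectStringParser keeps server addresses unresolved.
            InetSocketAddress addr = InetSocketAddress.createUnresolved("localhost", 2181);

            // Issue 2: an unresolved address has no InetAddress to return.
            System.out.println(addr.getAddress()); // null

            // Issue 3: on JDK 17 the text form includes "/<unresolved>", so code that
            // splits on ':' or '/' to recover the hostname no longer gets "localhost".
            System.out.println(addr); // localhost/<unresolved>:2181 (older JDKs printed localhost:2181)
        }
    }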