Hi guys!

> Do I understand correctly, that mostly affected are the users that are
> currently using Ignite on aarch64 (and some other, rare architectures)?

Yes, that's correct. Only those users are affected.

> What are those, can you please provide examples?

One example is "aarch64", the architecture of modern MacBooks.

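To make the category more concrete, here is a rough sketch of the three cases from my original message. It is not the actual GridUnsafe or DirectByteBufferStreamImplV1 code: the class and method names are made up for illustration, and real "os.arch" values vary between JVMs.

```java
import java.nio.ByteOrder;
import java.util.Set;

// Illustration only: hypothetical names, not part of Ignite.
final class UuidOrderSketch {
    // The four architecture names listed in this thread.
    static final Set<String> LISTED_ARCHS = Set.of("i386", "x86", "amd64", "x86_64");

    // Byte order that 3.0 effectively uses for UUID parts,
    // following the three cases described in the original message.
    static ByteOrder uuidOrderIn30(String arch, ByteOrder nativeOrder) {
        if (LISTED_ARCHS.contains(arch)) {
            return ByteOrder.BIG_ENDIAN;
        }
        // Other LE architectures write LE, BE architectures write BE.
        return nativeOrder;
    }

    // An architecture is affected by the "always Big Endian" fix
    // only if 3.0 wrote its UUID parts in Little Endian.
    static boolean affectedByFix(String arch, ByteOrder nativeOrder) {
        return uuidOrderIn30(arch, nativeOrder) != ByteOrder.BIG_ENDIAN;
    }

    public static void main(String[] args) {
        String arch = System.getProperty("os.arch");
        boolean affected = affectedByFix(arch, ByteOrder.nativeOrder());
        System.out.println(arch + " affected: " + affected);
    }
}
```

So any Little Endian architecture outside the four listed names ends up in the affected group, and "aarch64" is simply the most common one.
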
> However, I think that providing a log storage conversion tool along with
> the release may be a good idea from the UX standpoint

That makes sense. I'll create a Jira ticket for this tool and describe it
there, thank you!
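
The core of such a tool would be a per-UUID byte swap. Below is a minimal sketch, assuming each UUID is stored as 16 bytes: the most significant long followed by the least significant long, each written in Little Endian. That layout is my assumption for the example, and the class and method names are hypothetical. Locating UUIDs inside log storage entries is the hard part and is not shown.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.UUID;

// Illustration only: hypothetical names, assumed 16-byte layout.
final class UuidEndianConversionSketch {
    // Encodes the two halves of a UUID using the given byte order.
    static byte[] encode(UUID id, ByteOrder order) {
        return ByteBuffer.allocate(16)
                .order(order)
                .putLong(id.getMostSignificantBits())
                .putLong(id.getLeastSignificantBits())
                .array();
    }

    // Re-encodes a UUID whose halves were written in Little Endian
    // so that both halves are in Big Endian.
    static byte[] littleToBigEndian(byte[] le) {
        if (le.length != 16) {
            throw new IllegalArgumentException("Expected 16 bytes, got " + le.length);
        }
        ByteBuffer in = ByteBuffer.wrap(le).order(ByteOrder.LITTLE_ENDIAN);
        long msb = in.getLong();
        long lsb = in.getLong();
        return encode(new UUID(msb, lsb), ByteOrder.BIG_ENDIAN);
    }
}
```

Under this assumed layout the conversion just reverses the bytes of each 8-byte half, so it could be applied in place once the UUID offsets are known.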

On Thu, Jul 24, 2025 at 10:04, Pavel Tupitsyn <ptupit...@apache.org> wrote:

> > Little Endian architectures that are NOT
> >  included in the following list: "i386", "x86", "amd64", and "x86_64"
>
> What are those, can you please provide examples?
>
> On Wed, Jul 23, 2025 at 9:50 PM Aleksandr Polovtsev
> <alexpolovt...@gmail.com> wrote:
> >
> > Thank you, Ivan, for the detailed explanation.
> > Do I understand correctly, that mostly affected are the users that are
> > currently using Ignite on aarch64 (and some other, rare architectures)?
> > If yes, then your proposal makes sense, Ignite is mostly targeted at the
> > servers which usually run on x86_64.
> > However, I think that providing a log storage conversion tool along with
> > the release may be a good idea from the UX standpoint. Because if I'm an
> > affected user, I would expect to run this script straight away and not
> > write to the dev list and wait for the script to be delivered.
> >
> > On Wed, Jul 23, 2025 at 5:00 PM Ivan Bessonov <bessonov...@gmail.com> wrote:
> >
> > > Hello, Igniters!
> > >
> > > Recently we encountered an unexpected issue. Let me start with its roots,
> > > before I start discussing potential fixes.
> > >
> > > We noticed that certain benchmarks showed some inefficiencies when being
> > > run on new MacBooks. They were related to low-level serialization code,
> > > and the cause of it was an unaligned read in GridUnsafe. "aarch64" allows
> > > it, but the architecture is not included in the "GridUnsafe#unaligned"
> > > check, which resulted in the execution of fall-back code that reads and
> > > writes everything byte by byte.
> > >
> > > The fix seemed trivial, and we did it in [1] by adding "aarch64" into the
> > > list of architectures that support unaligned memory access. After a
> > > while, when we enabled the "ItCompatibilityTest#testCompatibility", we
> > > realized that compatibility on MacBooks is broken. The incompatibility
> > > has been caused by [1], and as a hotfix, it has been temporarily reverted
> > > in [2].
> > >
> > > How was that possible?
> > > When we finished the investigation, it turned out
> > > "DirectByteBufferStreamImplV1#writeUuid" and
> > > "DirectByteBufferStreamImplV1#readUuid" have a particularly nasty bug in
> > > them. This is how these methods behave in 3.0:
> > >  - If we run on an "i386", "x86", "amd64", or "x86_64", we will write
> > >    parts of UUID in Big Endian.
> > >  - If we run on other Little Endian architectures, we will write these
> > >    parts in Little Endian.
> > >  - If we run on a Big Endian architecture, we will write these parts in
> > >    Big Endian.
> > >
> > > When we added "aarch64" to the list of "unaligned" architectures, we
> > > started treating its data as BE in "main" while Ignite 3.0 treats it as
> > > LE. For the clarification - this stream is used for
> > > - Network communication, runtime only.
> > > - Serialization of raft commands, this data is written to the storage.
> > > That's why fix [1] broke compatibility.
> > >
> > > Such a behavior constitutes a problem, because network protocol and raft
> > > serialization must be architecture-independent:
> > > - It is possible that nodes in the same cluster are run in different
> > >   environments with different architectures.
> > > - It is possible, and almost guaranteed, that raft command serialization
> > >   happens on a different node, and thus must also be
> > >   architecture-independent.
> > >   (node A does the serialization, node B writes resulted payload into the
> > >   log storage)
> > >
> > > That's issue number 1. The issue number 2 was found when we inspected the
> > > code of "DirectByteBufferStreamImplV1". "writeFixedInt"/"readFixedInt"
> > > (long too) methods parity is violated in BE architectures. Writes are
> > > always LE, but read uses native bytes ordering.
> > >
> > > In other words, Ignite 3.0 doesn't really work on Big Endian
> > > architectures. Fixing this place in particular is trivial, we will do it
> > > in 3.1. Fixing broken Little Endian architectures might not be as
> > > trivial.
> > >
> > > My proposal is the following:
> > > - We fix the bug in UUID serialization, and always use Big Endian for
> > >   encoding there. This will make our protocols correct on all
> > >   architectures at once.
> > >   This fix will break backwards compatibility on Little Endian
> > >   architectures that are NOT included in the following list: "i386",
> > >   "x86", "amd64", and "x86_64".
> > >   This means that an upgrade from 3.0 to 3.1 will be impossible*.
> > > - We add "aarch64" into the list of architectures that support unaligned
> > >   memory access.
> > > - We explicitly disable "ItCompatibilityTest#testCompatibility" on a
> > >   number of architectures.
> > > - * If it turns out that we have a user, who uses one of those
> > >   architectures and who must upgrade their cluster from 3.0, we will
> > >   prepare and provide a log storage conversion tool that will replace all
> > >   Little Endian UUIDs to Big Endian format. As far as I'm aware, only log
> > >   storage is affected at the moment.
> > >
> > > It's better to fix it in 3.1, because it will be more widely adopted than
> > > 3.0. I will do that in [3].
> > > Please provide your feedback to the proposal. What are your thoughts?
> > > Thank you!
> > >
> > > [1] https://issues.apache.org/jira/browse/IGNITE-25564
> > > [2] https://issues.apache.org/jira/browse/IGNITE-25796
> > > [3] https://issues.apache.org/jira/browse/IGNITE-25797
> > >
> > > --
> > > Sincerely yours,
> > > Ivan Bessonov
> > >
> >
> >
> > --
> > With regards,
> > Aleksandr Polovtsev
>


-- 
Sincerely yours,
Ivan Bessonov
