Hi,

I'd like to propose an idea of creating new Ignite component for
integration with other platforms such as .Net, Ruby, NodeJS, etc.

Earlier in GridGain we had thin TCP clients for Java and .Net. They had
limited features and not-so-good performance (e.g. due to the inability to
reliably map a task to its affinity node, etc.). For now the Java client is
open source and is used only for internal purposes, while the .Net client was
fully reworked to use a JVM started in the same process instead of TCP and is
currently a GridGain enterprise feature.

But as we see growing interest in the product, it makes sense to expose some
native interfaces for easy integration with our product from any platform.

Let's discuss what the platform integration architecture should look like.

*1. JVM placement.*
One of the most important points is how the native platform will communicate
with the JVM that hosts the started node. There are a number of approaches to consider:
- Start the JVM in the same process. This allows for fast communication
between the JVM and the native platform. The drawback of this approach is that
we can start only one JVM per process. As a result, this solution might not
work in some environments (especially development ones), e.g. app servers
where multiple native applications run in the same process and each
application wants to start a node with different JVM properties, or
multi-process environments where there is a coordinator process which spawns
child processes with a limited lifecycle on demand (Apache, IIS, NodeJS, etc.).
- Connect to the JVM using some IPC mechanism (shared memory, pipes). This
approach might be a bit slower than the first one due to IPC overhead, but
still pretty fast. To implement it we will probably have to create some
intermediate management application which starts nodes in different
processes and provides handles for the native application to connect to them.
This approach will be more flexible than the first one.
- Connect to the JVM using TCP. This will be the slowest one, but it offers even
greater flexibility, as we will be able to transparently connect to nodes
even on other hosts. However, this raises some failover questions.

In summary, I think we should choose the "JVM in the same process" approach, as
we already have experience with it and it has proved to be functional and
performant, but create a careful abstraction (facade) for the node communication
logic, so that the shmem/pipes/TCP approaches can be implemented easily later
without disturbing other components.
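
To make this more concrete, below is a minimal sketch (in C++) of what such a
facade could look like on the native side. All names here (NodeConnection,
InProcessJvmConnection, etc.) are purely illustrative and not a proposed API;
the in-process variant boots the JVM through the standard JNI_CreateJavaVM
call, and shmem/pipes/TCP variants could implement the same interface later:

#include <jni.h>
#include <cstdint>
#include <string>
#include <stdexcept>

// Abstraction over the channel between the native platform and the node's JVM.
class NodeConnection {
public:
    virtual ~NodeConnection() {}

    // Invoke an operation on the Java side; 'ptr' encodes the request data.
    virtual void invoke(int operationType, int64_t ptr) = 0;
};

// "JVM in the same process" implementation: starts the JVM via JNI.
class InProcessJvmConnection : public NodeConnection {
public:
    explicit InProcessJvmConnection(const std::string& classPath) {
        std::string cpOpt = "-Djava.class.path=" + classPath;

        JavaVMOption opts[1];
        opts[0].optionString = const_cast<char*>(cpOpt.c_str());

        JavaVMInitArgs args;
        args.version = JNI_VERSION_1_6;
        args.nOptions = 1;
        args.options = opts;
        args.ignoreUnrecognized = JNI_FALSE;

        // Note: only one JVM per process is allowed, which is exactly the
        // limitation discussed above.
        if (JNI_CreateJavaVM(&jvm, reinterpret_cast<void**>(&env), &args) != JNI_OK)
            throw std::runtime_error("Failed to start JVM");
    }

    virtual void invoke(int operationType, int64_t ptr) {
        // Look up a static Java entry point and call it through 'env';
        // omitted for brevity.
    }

private:
    JavaVM* jvm;
    JNIEnv* env;
};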

*2. Data transfer and serialization.*
Another important point is how to pass data between Java and non-Java
platforms. Obviously we will have to provide some common format for both
interacting platforms, so that data serialized on one side can be
deserialized on the other if needed.
For the JVM-in-the-same-process approach it makes sense to organize data transfer
over offheap memory. Earlier we experimented with more sophisticated
mechanisms like "pin a Java heap array in the native platform -> write directly
to that array -> unpin", but this approach has some serious problems (like
JVM intrinsic methods hanging while the array is pinned), while not providing a
significant performance benefit.
So I think data transfer over offheap will be enough, as this is a simple and
reliable solution with acceptable performance.
Also we must remember that platforms may potentially have different
mechanisms for data transfer. E.g., sometimes we have to marshal an object to
bytes before passing it to Java, and sometimes we may just pass a pointer (e.g.
structs in C or .Net with a known layout), etc. We should be able to
potentially support all these cases.
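
To illustrate the "known layout" case, here is a tiny sketch (the struct
itself is made up for the example): a value with a fixed, agreed-upon layout
can be handed to the other platform as a raw pointer, while a complex object
graph would still have to be marshalled to bytes first.

#include <cstdint>

// A fixed-layout value both sides agree on; no marshalling step is needed,
// the address of the struct can be passed across the platform boundary as-is.
#pragma pack(push, 1)
struct PriceTick {
    int64_t instrumentId;
    int64_t timestamp;
    double  price;
};
#pragma pack(pop)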

In summary, I propose to use offheap as the default implementation, while
still leaving room for changing this if needed. E.g. instead of passing an
offheap pointer + data length:

void invokeOtherPlatform(long dataPointer, int dataLen);

we should design it as:

void invokeOtherPlatform(long pointer);

where the pointer encodes all information required for the other platform to
read the data. E.g. it can be a pointer to a memory region where the first 4
bytes are the data length and the rest is the serialized object.
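
For illustration, here is a minimal sketch of reading such a region on the
receiving side, assuming the layout described above (a 4-byte length prefix
followed by the serialized bytes); the helper name is made up:

#include <cstdint>
#include <cstring>
#include <vector>

// Reads a memory region laid out as [int32 length][serialized bytes].
// 'pointer' is the single value passed across the platform boundary.
std::vector<uint8_t> readData(int64_t pointer) {
    const uint8_t* base = reinterpret_cast<const uint8_t*>(pointer);

    int32_t len;
    std::memcpy(&len, base, sizeof(len));               // First 4 bytes: data length.

    std::vector<uint8_t> data(len);
    std::memcpy(data.data(), base + sizeof(len), len);  // The rest: serialized object.

    return data;
}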

*3. Queries support.*
Queries are one of the most demanded features of the product. But at the
moment they only work with Java objects, because the query engine uses Java
serialization to get fields from them.
We will have to provide users a way to alter this somehow so that objects from
native platforms are supported as well.
A good candidate for this is the IgniteCacheObjectProcessor interface, which is
responsible for object serialization.
We will have to investigate what should be done to let its implementation
(either the default or some custom one) work with objects from other platforms.

*4. Extensibility.*
We will have a set of C/C++ interfaces exposing basic features (e.g. cache,
compute, queries, etc.).
But as we do not know in advance what implementors will want to do apart
from the regular Java methods, it makes sense to leave some extensibility
points. At first glance they may look as follows:

interface Cache {
    void get(void* inData, void* outData); // Regular cache operation.
    bool put(void* inData); // Another regular cache operation.
    ...
    void invoke(int operationType, void* inData, void* outData); // Extensibility point.
}

In this example we define an "invoke" method where the user may pass virtually
anything. So, when some new functionality is required, the user implements it
in Java, injects it into Ignite somehow (e.g. through config) and implements
the counterpart in the native platform. But the user WILL NOT have to change any
Ignite C interfaces or rebuild them.
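
As a sketch of how such an extensibility point could be used from the native
side (the operation code, the function and the payload format are all
hypothetical; "Cache" is restated from the sketch above so the example is
self-contained):

// The Cache interface sketched above, expressed as a C++ abstract class.
class Cache {
public:
    virtual ~Cache() {}
    virtual void invoke(int operationType, void* inData, void* outData) = 0;
};

// Hypothetical user-defined operation code, agreed upon between the user's
// native code and their custom Java processor plugged into Ignite via config.
const int OP_RECALCULATE_INDEX = 1001;

void recalculateIndex(Cache& cache, void* request, void* response) {
    // The request/response payloads use whatever format the user's own Java
    // code understands; Ignite's C interfaces stay untouched.
    cache.invoke(OP_RECALCULATE_INDEX, request, response);
}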

*5. Configuration.*
Last, but not least - how to configure Ignite on other platforms. Currently
the only way to do that is Spring XML. This approach works well for Java
developers, but is not so good for others, because a developer who is not
familiar with Java/Spring will have to learn quite a few things about them.
E.g. try configuring a HashMap in Spring with an int key/value :-) Non-Java
developers will have a hard time doing this.
So we will probably have to let users use the native configuration mechanisms
of their platforms. This is not really critical from a features perspective,
but it will significantly improve the user experience.

Please share your thoughts and ideas about this.

Vladimir.
