We've discussed the thin client compute protocol with Pavel Tupitsyn and Igor Sapego and concluded that the approach with two-way requests should be used: the client generates a taskId and sends a request to the server to execute a task. The server responds that the request has been accepted. After the task has finished, the server notifies the client (it sends a request without waiting for a response). The client can cancel the task by sending a corresponding request to the server.
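The exchange described above can be sketched from the client's side as follows. This is only an illustration, not the protocol implementation: the class `ComputeTaskTracker` and all of its methods are hypothetical names, and the sketch assumes that a per-connection counter is enough to generate unique task ids. Network I/O is left out; the comments mark where the requests would be sent or received.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical client-side bookkeeping for the two-way compute exchange. */
class ComputeTaskTracker {
    /** Client-generated task ids (assumption: a counter is unique enough per connection). */
    private final AtomicLong taskIdGen = new AtomicLong();

    /** Tasks that the server has not yet reported as finished. */
    private final Map<Long, CompletableFuture<Object>> pending = new ConcurrentHashMap<>();

    /** Registers a task; the caller would then send the "execute" request carrying this id. */
    long register() {
        long taskId = taskIdGen.incrementAndGet();
        pending.put(taskId, new CompletableFuture<>());
        return taskId;
    }

    /** Returns the future of a task that is still pending, or null if it is unknown. */
    CompletableFuture<Object> future(long taskId) {
        return pending.get(taskId);
    }

    /** Handles the server's "task finished" notification; nothing is sent back. */
    void onTaskFinished(long taskId, Object result) {
        CompletableFuture<Object> fut = pending.remove(taskId);
        if (fut != null)
            fut.complete(result);
    }

    /** Cancels locally; the caller would also send the "cancel" request with the same id. */
    boolean cancel(long taskId) {
        CompletableFuture<Object> fut = pending.remove(taskId);
        return fut != null && fut.cancel(false);
    }

    public static void main(String[] args) {
        ComputeTaskTracker tracker = new ComputeTaskTracker();
        long taskId = tracker.register();
        CompletableFuture<Object> fut = tracker.future(taskId);
        tracker.onTaskFinished(taskId, "result"); // simulated server notification
        System.out.println(fut.join());
    }
}
```

Keying everything by the client-generated task id means the cancel request and the completion notification refer to the same identifier, independent of the request ids used by the transport.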
Also, a node list should be passed (optionally) with a request to limit the nodes that execute the task.

I will create an IEP and file detailed protocol changes shortly.

Tue, Jan 21, 2020 at 18:46, Alex Plehanov <plehanov.a...@gmail.com>:

> Igor, thanks for the reply.
>
> > Approach with taskId will require a lot of changes in protocol and thus more "heavy" for implementation
> Do you mean the approach with the server notifications mechanism? Yes, it will require a lot of changes. But in the most recent messages we've discussed with Pavel an approach without the server notifications mechanism. This approach has the same complexity and performance as the approach with requestId.
>
> > But such clients as Python, Node.js, PHP, Go most probably won't have support for this API, at least for now.
> Without a server notifications mechanism, there will be no breaking changes in the protocol, so a client implementation can just skip this feature and protocol version and implement the next one.
>
> > Or never.
> I think it is still useful to execute Java compute tasks from non-Java thin clients. Also, we can provide some out-of-the-box Java tasks, for example ExecutePythonScriptTask with a Python compute implementation, which can run a Python script on a server node.
>
> > So, maybe it's a good time for us to change our backward compatibility mechanism from protocol versioning to feature masks?
> I like the idea with feature masks, but it will force us to support both backward compatibility mechanisms, protocol versioning and feature masks.
>
> Mon, Jan 20, 2020 at 20:34, Pavel Tupitsyn <ptupit...@apache.org>:
>
>> Huge +1 from me for Feature Masks.
>> I think this should be our top priority for thin client protocol, since it simplifies change management a lot.
>>
>> On Mon, Jan 20, 2020 at 8:21 PM Igor Sapego <isap...@apache.org> wrote:
>>
>> > Sorry for the late reply.
>> >
>> > Approach with taskId will require a lot of changes in protocol and thus more "heavy" for implementation, but it definitely looks to me less hacky than the reqId approach. Moreover, as was mentioned, a server notifications mechanism will be required in the future anyway with high probability. So from this point of view I like the taskId approach.
>> >
>> > On the other hand, what we should also consider here is performance. Speaking of latency, it looks like reqId will have better results in the case of small and fast tasks. The only question here is whether we want to optimize thin clients for this case.
>> >
>> > Also, what you are talking about mostly involves clients on platforms that already have Compute API for thick clients. Let me mention one more point of view and another concern here.
>> >
>> > The changes you propose are going to change the protocol version for sure. In the case of the taskId approach and server notifications, even more so.
>> >
>> > But such clients as Python, Node.js, PHP, Go most probably won't have support for this API, at least for now. Or never. But the current backward-compatibility mechanism implies protocol versions where we imply that a client that supports version 1.5 also supports all the features introduced in all the previous versions of the protocol.
>> >
>> > Thus implementing Compute API in any of the proposed ways *may* force the mentioned clients to support changes in the protocol which they do not necessarily need in order to introduce new features in the future.
>> >
>> > So, maybe it's a good time for us to change our backward compatibility mechanism from protocol versioning to feature masks?
>> >
>> > WDYT?
>> >
>> > Best Regards,
>> > Igor
>> >
>> > On Fri, Jan 17, 2020 at 9:37 AM Alex Plehanov <plehanov.a...@gmail.com> wrote:
>> >
>> > > Looks like we didn't reach consensus here.
>> > >
>> > > Igor, as thin client maintainer, can you please share your opinion?
>> > >
>> > > Everyone else is also welcome, please share your thoughts about options to implement operations for compute.
>> > >
>> > > Thu, Nov 28, 2019 at 10:02, Alex Plehanov <plehanov.a...@gmail.com>:
>> > >
>> > > > > Since all thin client operations are inherently async, we should be able to cancel any of them
>> > > > It's illogical to have such an ability. What should a cancel operation of a cancel operation do? Moreover, sometimes it's dangerous; for example, a create cache operation should never be canceled. There should be an explicit set of processes that we can cancel: queries, transactions, tasks, services. The lifecycle of services is more complex than the lifecycle of tasks. With services, I suppose, we can't use request cancelation, so tasks will be the only process with an exceptional pattern.
>> > > >
>> > > > > The request would be "execute task with specified node filter" - simple and efficient.
>> > > > It's not simple: every compute or service request should contain complex node filtering logic, which duplicates the same logic for cluster API.
>> > > > It's not efficient: for example, we can't implement forPredicate() filtering in this case.
>> > > >
>> > > > Wed, Nov 27, 2019 at 19:25, Pavel Tupitsyn <ptupit...@apache.org>:
>> > > >
>> > > >> > The request is already processed (task is started), we can't cancel the request
>> > > >> The request is not "start a task". It is "execute task" (and get result).
>> > > >> Same as "cache get" - you get a result in the end, we don't "start cache get" then "end cache get".
>> > > >>
>> > > >> Since all thin client operations are inherently async, we should be able to cancel any of them by sending another request with an id of prior request to be cancelled.
>> > > >> That's why I'm advocating for this approach - it will work for anything, no special cases.
>> > > >> And it keeps "happy path" as simple as it is right now.
>> > > >>
>> > > >> Queries are different because we retrieve results in pages, we can't do them as one request.
>> > > >> Transactions are also different because client controls when they should end.
>> > > >> There is no reason for task execution to be a special case like queries or transactions.
>> > > >>
>> > > >> > we always need to send 2 requests to server to execute the task
>> > > >> Nope. We don't need to get nodes on client at all.
>> > > >> The request would be "execute task with specified node filter" - simple and efficient.
>> > > >>
>> > > >> On Wed, Nov 27, 2019 at 4:31 PM Alex Plehanov <plehanov.a...@gmail.com> wrote:
>> > > >>
>> > > >> > > We do cancel a request to perform a task. We may and should use this to cancel any other request in future.
>> > > >> > The request is already processed (task is started), we can't cancel the request. As you mentioned before, we already do almost the same for queries (close the cursor, but not cancel the request to run a query), it's better to do such things in a common way. We have a pattern: start some process (query, transaction), get id of this process, end process by this id. The "Execute task" process should match the same pattern.
>> > > >> > In my opinion, implementation with two-way requests is the best option to match this pattern (we can even reuse the OP_RESOURCE_CLOSE operation type in this case). Sometime in the future, we will need two-way requests for some other functionality (continuous queries, event listening, etc). But even without two-way requests, introducing some process id (task id in our case) will be closer to the existing pattern than canceling tasks by request id.
>> > > >> >
>> > > >> > > So every new request will apply those filters on server side, using the most recent set of nodes.
>> > > >> > In this case, we always need to send 2 requests to the server to execute the task: first to get nodes by the filter, second to actually execute the task. It seems like overhead. The same will be true for services. A cluster group remains the same if the topology hasn't changed. We can use this fact and bind the "execute task" request to the topology. If the topology has changed, get nodes for the new topology and retry the request.
>> > > >> >
>> > > >> > Tue, Nov 26, 2019 at 17:44, Pavel Tupitsyn <ptupit...@apache.org>:
>> > > >> >
>> > > >> > > > After all, we don't cancel request
>> > > >> > > We do cancel a request to perform a task. We may and should use this to cancel any other request in future.
>> > > >> > >
>> > > >> > > > Client uses some cluster group filtration (for example forServers() cluster group)
>> > > >> > > Please see above - Aleksandr Shapkin described how we store filtered cluster groups on client.
>> > > >> > > We don't store node IDs, we store actual filters.
>> > > >> > > So every new request will apply those filters on server side, using the most recent set of nodes.
>> > > >> > >
>> > > >> > > // This does not issue any server requests, just builds an object with filters on client
>> > > >> > > var myGrp = cluster.forServers().forAttribute("foo");
>> > > >> > > // Every request includes filters, and filters are applied on the server side
>> > > >> > > while (true) myGrp.compute().executeTask("bar");
>> > > >> > >
>> > > >> > > On Tue, Nov 26, 2019 at 1:42 PM Alex Plehanov <plehanov.a...@gmail.com> wrote:
>> > > >> > >
>> > > >> > > > > Anyway, my point stands.
>> > > >> > > > I can't agree. Why don't you want to use task id for this? After all, we don't cancel request (request is already processed), we cancel the task. So it's more convenient to use task id here.
>> > > >> > > >
>> > > >> > > > > Can you please provide equivalent use case with existing "thick" client?
>> > > >> > > > For example:
>> > > >> > > > Cluster consists of one server node.
>> > > >> > > > Client uses some cluster group filtration (for example forServers() cluster group).
>> > > >> > > > Client starts to send periodically (for example 1 per minute) long-term (for example 1 hour long) tasks to the cluster.
>> > > >> > > > Meanwhile, several server nodes joined the cluster.
>> > > >> > > >
>> > > >> > > > In case of thick client: All server nodes will be used, tasks will be load balanced.
>> > > >> > > > In case of thin client: Only one server node will be used, client will detect topology change after an hour.
>> > > >> > > >
>> > > >> > > > Tue, Nov 26, 2019 at 11:50, Pavel Tupitsyn <ptupit...@apache.org>:
>> > > >> > > >
>> > > >> > > > > > I can't see any usage of request id in query cursors
>> > > >> > > > > You are right, cursor id is a separate thing.
>> > > >> > > > > Anyway, my point stands.
>> > > >> > > > >
>> > > >> > > > > > client sends long term tasks to nodes and wants to do it with load balancing
>> > > >> > > > > I still don't get it. Can you please provide equivalent use case with existing "thick" client?
>> > > >> > > > >
>> > > >> > > > > On Mon, Nov 25, 2019 at 11:59 PM Alex Plehanov <plehanov.a...@gmail.com> wrote:
>> > > >> > > > >
>> > > >> > > > > > > And it is fine to use request ID to identify compute tasks (as we do with query cursors).
>> > > >> > > > > > I can't see any usage of request id in query cursors. We send query request and get cursor id in response. After that, we only use cursor id (to get next pages and to close the resource). Did I miss something?
>> > > >> > > > > >
>> > > >> > > > > > > Looks like I'm missing something - how is topology change relevant to executing compute tasks from client?
>> > > >> > > > > > It's not relevant directly. But there are some cases where it will be helpful. For example, if a client sends long-term tasks to nodes and wants to do it with load balancing, it will detect a topology change only after some time in the future with the first response, so load balancing will not work.
>> > > >> > > > > > Perhaps we can add an optional "topology version" field to the OP_COMPUTE_EXECUTE_TASK request to solve this problem.
>> > > >> > > > > >
>> > > >> > > > > > Mon, Nov 25, 2019 at 22:42, Pavel Tupitsyn <ptupit...@apache.org>:
>> > > >> > > > > >
>> > > >> > > > > > > Alex,
>> > > >> > > > > > >
>> > > >> > > > > > > > we will mix entities from different layers (transport layer and request body)
>> > > >> > > > > > > I would not call our message header (which includes the id) "transport layer".
>> > > >> > > > > > > TCP is our transport layer. And it is fine to use request ID to identify compute tasks (as we do with query cursors).
>> > > >> > > > > > >
>> > > >> > > > > > > > we still can't be sure that the task is successfully started on a server
>> > > >> > > > > > > The request to start the task will fail and we'll get a response indicating that right away
>> > > >> > > > > > >
>> > > >> > > > > > > > we won't ever know about topology change
>> > > >> > > > > > > Looks like I'm missing something - how is topology change relevant to executing compute tasks from client?
>> > > >> > > > > > >
>> > > >> > > > > > > On Mon, Nov 25, 2019 at 10:17 PM Alex Plehanov <plehanov.a...@gmail.com> wrote:
>> > > >> > > > > > >
>> > > >> > > > > > > > Pavel, in this case, we will mix entities from different layers (transport layer and request body), it's not very good.
>> > > >> > > > > > > > We can achieve the same behavior with a task id generated on the client side, but there will be no inter-layer data intersection and I think it will be easier to implement on both the client and the server side. But we still can't be sure that the task is successfully started on a server. We won't ever know about a topology change, because the topology changed flag will be sent from server to client only with a response when the task is completed. Do we accept that?
>> > > >> > > > > > > >
>> > > >> > > > > > > > Mon, Nov 25, 2019 at 19:07, Pavel Tupitsyn <ptupit...@apache.org>:
>> > > >> > > > > > > >
>> > > >> > > > > > > > > Alex,
>> > > >> > > > > > > > >
>> > > >> > > > > > > > > I have a simpler idea. We already do request id handling in the protocol, so:
>> > > >> > > > > > > > > - Client sends a normal request to execute compute task. Request ID is generated as usual.
>> > > >> > > > > > > > > - As soon as task is completed, a response is received.
>> > > >> > > > > > > > >
>> > > >> > > > > > > > > As for cancellation - client can send a new request (with new request ID) and (in the body) pass the request ID from above as a task identifier.
>> > > >> > > > > > > > > As a result, there are two responses:
>> > > >> > > > > > > > > - Cancellation response
>> > > >> > > > > > > > > - Task response (with proper cancelled status)
>> > > >> > > > > > > > >
>> > > >> > > > > > > > > That's it, no need to modify the core of the protocol. One request - one response.
>> > > >> > > > > > > > >
>> > > >> > > > > > > > > On Mon, Nov 25, 2019 at 6:20 PM Alex Plehanov <plehanov.a...@gmail.com> wrote:
>> > > >> > > > > > > > >
>> > > >> > > > > > > > > > Pavel, we need to inform the client when the task is completed, and we need the ability to cancel the task. I see several ways to implement this:
>> > > >> > > > > > > > > >
>> > > >> > > > > > > > > > 1. Client sends a request to the server to start a task, server returns a task id in response. Server notifies the client when the task is completed with a new request (from server to client). Client can cancel the task by sending a new request with operation type "cancel" and the task id. In this case, we should implement 2-way requests.
>> > > >> > > > > > > > > > 2. Client generates a unique task id and sends a request to the server to start a task, server doesn't reply immediately but waits until the task is completed. Client can cancel the task by sending a new request with operation type "cancel" and the task id. In this case, we should decouple request and response on the server side (currently the response is sent right after the request was processed). Also, we can't be sure that the task is successfully started on a server.
>> > > >> > > > > > > > > > 3. Client sends a request to the server to start a task, server returns an id in response. Client periodically asks the server about the task status. Client can cancel the task by sending a new request with operation type "cancel" and the task id. This case brings some overhead to the communication channel.
>> > > >> > > > > > > > > >
>> > > >> > > > > > > > > > Personally, I think that the case with 2-way requests is better, but I'm open to any other ideas.
>> > > >> > > > > > > > > >
>> > > >> > > > > > > > > > Aleksandr,
>> > > >> > > > > > > > > >
>> > > >> > > > > > > > > > Filtering logic for OP_CLUSTER_GROUP_GET_NODE_IDS looks overcomplicated. Do we need server-side filtering at all? Wouldn't it be better to send basic info (ids, order, flags) for all nodes (this is a relatively small amount of data) and extended info (attributes) for a selected list of nodes? In this case, we can do basic node filtration on the client side (forClients(), forServers(), forNodeIds(), forOthers(), etc).
>> > > >> > > > > > > > > >
>> > > >> > > > > > > > > > Do you use standard ClusterNode serialization? There are also metrics serialized with ClusterNode, do we need them on the thin client? There are other interfaces to show metrics, so I think it's redundant to export metrics to thin clients too.
>> > > >> > > > > > > > > >
>> > > >> > > > > > > > > > What do you think?
>> > > >> > > > > > > > > >
>> > > >> > > > > > > > > > Fri, Nov 22, 2019 at 20:15, Aleksandr Shapkin <lexw...@gmail.com>:
>> > > >> > > > > > > > > >
>> > > >> > > > > > > > > > > Alex,
>> > > >> > > > > > > > > > >
>> > > >> > > > > > > > > > > I think you can create a new IEP page and I will fill it with the Cluster API details.
>> > > >> > > > > > > > > > >
>> > > >> > > > > > > > > > > In short, I've introduced several new codes:
>> > > >> > > > > > > > > > >
>> > > >> > > > > > > > > > > Cluster API is pretty straightforward:
>> > > >> > > > > > > > > > >
>> > > >> > > > > > > > > > > OP_CLUSTER_IS_ACTIVE = 5000
>> > > >> > > > > > > > > > > OP_CLUSTER_CHANGE_STATE = 5001
>> > > >> > > > > > > > > > > OP_CLUSTER_CHANGE_WAL_STATE = 5002
>> > > >> > > > > > > > > > > OP_CLUSTER_GET_WAL_STATE = 5003
>> > > >> > > > > > > > > > >
>> > > >> > > > > > > > > > > Cluster group codes:
>> > > >> > > > > > > > > > >
>> > > >> > > > > > > > > > > OP_CLUSTER_GROUP_GET_NODE_IDS = 5100
>> > > >> > > > > > > > > > > OP_CLUSTER_GROUP_GET_NODE_INFO = 5101
>> > > >> > > > > > > > > > >
>> > > >> > > > > > > > > > > The underlying implementation is based on the thick client logic.
>> > > >> > > > > > > > > > >
>> > > >> > > > > > > > > > > For every request, we provide a known topology version, and if it has changed, the client first updates it and then re-sends the filtering request.
>> > > >> > > > > > > > > > >
>> > > >> > > > > > > > > > > Alongside the topVer, the client sends a serialized node projection object that can be considered a code-to-value mapping.
>> > > >> > > > > > > > > > >
>> > > >> > > > > > > > > > > Consider: [{Code = 1, Value = ["DotNet", "MyAttribute"]}, {Code = 2, Value = 1}]
>> > > >> > > > > > > > > > >
>> > > >> > > > > > > > > > > Where "1" stands for attribute filtering and "2" for the serverNodesOnly flag.
>> > > >> > > > > > > > > > >
>> > > >> > > > > > > > > > > As a result of request processing, the server sends nodeId UUIDs and the current topVer.
>> > > >> > > > > > > > > > >
>> > > >> > > > > > > > > > > When the client obtains nodeIds, it can perform a NODE_INFO call to get a serialized ClusterNode object.
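The code-to-value projection quoted above could be applied on the server roughly as in the sketch below. Everything here is illustrative: `NodeProjectionSketch`, `Filter` and `Node` are hypothetical types, the attribute value is assumed to be a `[name, value]` pair (the message does not say which element is which), and `Value = 1` is assumed to mean "keep server nodes only". Topology version handling is omitted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.UUID;

/** Hypothetical server-side evaluation of a node projection. */
class NodeProjectionSketch {
    static final int ATTRIBUTE = 1;         // value: String[] {name, value} (assumed order)
    static final int SERVER_NODES_ONLY = 2; // value: 1 keeps server nodes only (assumed)

    /** One code-to-value entry of the projection. */
    static final class Filter {
        final int code;
        final Object value;

        Filter(int code, Object value) {
            this.code = code;
            this.value = value;
        }
    }

    /** Minimal stand-in for a cluster node. */
    static final class Node {
        final UUID id;
        final boolean client;
        final Map<String, String> attrs;

        Node(UUID id, boolean client, Map<String, String> attrs) {
            this.id = id;
            this.client = client;
            this.attrs = attrs;
        }
    }

    /** Returns ids of the nodes that pass every filter of the projection. */
    static List<UUID> apply(List<Filter> projection, List<Node> topology) {
        List<UUID> res = new ArrayList<>();
        for (Node n : topology)
            if (matches(n, projection))
                res.add(n.id);
        return res;
    }

    private static boolean matches(Node n, List<Filter> projection) {
        for (Filter f : projection) {
            switch (f.code) {
                case ATTRIBUTE: {
                    String[] kv = (String[]) f.value;
                    if (!kv[1].equals(n.attrs.get(kv[0])))
                        return false;
                    break;
                }
                case SERVER_NODES_ONLY:
                    if (Integer.valueOf(1).equals(f.value) && n.client)
                        return false;
                    break;
                default:
                    throw new IllegalArgumentException("Unknown filter code: " + f.code);
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, String> attrs = Collections.singletonMap("DotNet", "MyAttribute");
        List<Node> topology = Arrays.asList(
            new Node(UUID.randomUUID(), false, attrs),
            new Node(UUID.randomUUID(), true, attrs));
        List<Filter> projection = Arrays.asList(
            new Filter(ATTRIBUTE, new String[] {"DotNet", "MyAttribute"}),
            new Filter(SERVER_NODES_ONLY, 1));
        System.out.println(apply(projection, topology).size());
    }
}
```

Because the client ships filters rather than node ids, each request is evaluated against whatever topology the server sees at that moment.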
>> > > >> > > > > > > > > > > In addition, there should be a different API method for accessing/updating node metrics.
>> > > >> > > > > > > > > > >
>> > > >> > > > > > > > > > > Thu, Nov 21, 2019 at 12:32, Sergey Kozlov <skoz...@gridgain.com>:
>> > > >> > > > > > > > > > >
>> > > >> > > > > > > > > > > > Hi Pavel
>> > > >> > > > > > > > > > > >
>> > > >> > > > > > > > > > > > On Thu, Nov 21, 2019 at 11:30 AM Pavel Tupitsyn <ptupit...@apache.org> wrote:
>> > > >> > > > > > > > > > > >
>> > > >> > > > > > > > > > > > > 1. I believe that Cluster operations for Thin Client protocol are already in the works by Alexandr Shapkin. Can't find the ticket though.
>> > > >> > > > > > > > > > > > > Alexandr, can you please confirm and attach the ticket number?
>> > > >> > > > > > > > > > > > >
>> > > >> > > > > > > > > > > > > 2. Proposed changes will work only for Java tasks that are already deployed on server nodes.
>> > > >> > > > > > > > > > > > > This is mostly useless for other thin clients we have (Python, PHP, .NET, C++).
>> > > >> > > > > > > > > > > >
>> > > >> > > > > > > > > > > > I don't guess so.
>> > > >> > > > > > > > > > > > The task (execution) is a way to implement own layer for the thin client application.
>> > > >> > > > > > > > > > > >
>> > > >> > > > > > > > > > > > > We should think of a way to make this useful for all clients.
>> > > >> > > > > > > > > > > > > For example, we may allow sending tasks in some scripting language like Javascript.
>> > > >> > > > > > > > > > > > > Thoughts?
>> > > >> > > > > > > > > > > >
>> > > >> > > > > > > > > > > > The arbitrary code execution from a remote client must be protected from malicious code.
>> > > >> > > > > > > > > > > > I don't know how it could be designed but without that we open the hole to kill cluster.
>> > > >> > > > > > > > > > > >
>> > > >> > > > > > > > > > > > > On Thu, Nov 21, 2019 at 11:21 AM Sergey Kozlov <skoz...@gridgain.com> wrote:
>> > > >> > > > > > > > > > > > >
>> > > >> > > > > > > > > > > > > > Hi Alex
>> > > >> > > > > > > > > > > > > >
>> > > >> > > > > > > > > > > > > > The idea is great. But I have some concerns that probably should be taken into account for design:
>> > > >> > > > > > > > > > > > > >
>> > > >> > > > > > > > > > > > > > 1. We need to have the ability to stop a task execution, smth like OP_COMPUTE_CANCEL_TASK operation (client to server)
>> > > >> > > > > > > > > > > > > > 2. What's about task execution timeout? It may help to the cluster survival for buggy tasks
>> > > >> > > > > > > > > > > > > > 3. Ignite doesn't have roles/authorization functionality for now. But a task is the risky operation for cluster (for security reasons). Could we add for Ignite configuration new options:
>> > > >> > > > > > > > > > > > > >    - Explicit turning on for compute task support for thin protocol (disabled by default) for whole cluster
>> > > >> > > > > > > > > > > > > >    - Explicit turning on for compute task support for a node
>> > > >> > > > > > > > > > > > > >    - The list of task names (classes) allowed to execute by thin client.
>> > > >> > > > > > > > > > > > > > 4. Support the labeling for task that may help to investigate issues on cluster (the idea from IEP-34 [1])
>> > > >> > > > > > > > > > > > > >
>> > > >> > > > > > > > > > > > > > [1] https://cwiki.apache.org/confluence/display/IGNITE/IEP-34+Thin+client%3A+transactions+support
>> > > >> > > > > > > > > > > > > >
>> > > >> > > > > > > > > > > > > > On Thu, Nov 21, 2019 at 10:58 AM Alex Plehanov <plehanov.a...@gmail.com> wrote:
>> > > >> > > > > > > > > > > > > >
>> > > >> > > > > > > > > > > > > > > Hello, Igniters!
>> > > >> > > > > > > > > > > > > > >
>> > > >> > > > > > > > > > > > > > > I have plans to start implementation of Compute interface for Ignite thin client and want to discuss features that should be implemented.
> > > > >
> > > > > We already have a Compute implementation for binary-rest clients
> > > > > (GridClientCompute), which has the following functionality:
> > > > > - Filtering cluster nodes (projection) for compute
> > > > > - Executing a task by name
> > > > >
> > > > > I think we can implement this functionality in a thin client as
> > > > > well.
> > > > >
> > > > > First of all, we need some operation types to request a list of
> > > > > all available nodes and probably node attributes (by a list of
> > > > > nodes). Node attributes will be helpful if we decide to implement
> > > > > an analog of the ClusterGroup#forAttribute or
> > > > > ClusterGroup#forPredicate methods in the thin client. Perhaps they
> > > > > can be requested lazily.
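To make the node-attribute idea concrete, here is a minimal sketch of a client-side analog of ClusterGroup#forAttribute. It assumes the client has already obtained a map of node IDs to attributes from the server. The class `NodeAttributeFilter` and its method are hypothetical names used only for illustration; they are not part of any existing Ignite API.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;
import java.util.stream.Collectors;

/** Hypothetical helper for illustration only; not an existing Ignite class. */
final class NodeAttributeFilter {
    private NodeAttributeFilter() {
    }

    /**
     * Returns the ids of nodes whose attribute {@code name} equals {@code value}.
     * The map stands in for attribute data previously fetched from the server.
     */
    static List<UUID> forAttribute(Map<UUID, Map<String, Object>> attrsByNode,
                                   String name, Object value) {
        return attrsByNode.entrySet().stream()
            .filter(e -> value.equals(e.getValue().get(name)))
            .map(Map.Entry::getKey)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Made-up node ids and attributes, purely for demonstration.
        Map<UUID, Map<String, Object>> attrs = new LinkedHashMap<>();
        attrs.put(new UUID(0L, 1L), Map.of("role", "compute"));
        attrs.put(new UUID(0L, 2L), Map.of("role", "storage"));

        System.out.println(forAttribute(attrs, "role", "compute"));
    }
}
```

The resulting list of node IDs could then be passed along with a task request to limit where the task runs. Fetching attributes lazily, as suggested above, would only change how the map gets populated.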
> > > > >
> > > > > From the protocol point of view there will be two new operations:
> > > > >
> > > > > OP_CLUSTER_GET_NODES
> > > > > Request: empty
> > > > > Response: long topologyVersion, int minorTopologyVersion, int
> > > > > nodesCount, for each node a set of node fields (UUID nodeId, Object
> > > > > or String consistentId, long order, etc)
> > > > >
> > > > > OP_CLUSTER_GET_NODE_ATTRIBUTES
> > > > > Request: int nodesCount, for each node: UUID nodeId
> > > > > Response: int nodesCount, for each node: int attributesCount, for
> > > > > each node attribute: String name, Object value
> > > > >
> > > > > To execute tasks we need something like these methods in the client
> > > > > API:
> > > > > Object execute(String task, Object arg)
> > > > > Future<Object> executeAsync(String task, Object arg)
> > > > > Object affinityExecute(String task, String cache, Object key, Object arg)
> > > > > Future<Object> affinityExecuteAsync(String task, String cache, Object key, Object arg)
> > > > >
> > > > > Which can be mapped to protocol operations:
> > > > >
> > > > > OP_COMPUTE_EXECUTE_TASK
> > > > > Request: UUID nodeId, String taskName, Object arg
> > > > > Response: Object result
> > > > >
> > > > > OP_COMPUTE_EXECUTE_TASK_AFFINITY
> > > > > Request: String cacheName, Object key, String taskName, Object arg
> > > > > Response: Object result
> > > > >
> > > > > The second operation is needed because we sometimes can't calculate
> > > > > and connect to the affinity node on the client side (affinity
> > > > > awareness can be disabled, a custom affinity function can be used,
> > > > > or there can be no connection between the client and the affinity
> > > > > node), but we can make a best effort to send the request to the
> > > > > target node if affinity awareness is enabled.
> > > > >
> > > > > Currently, on the server side requests are always processed
> > > > > synchronously and responses are sent right after the request was
> > > > > processed. To execute long tasks asynchronously we should either
> > > > > change this logic or introduce some kind of two-way communication
> > > > > between client and server (now only one-way requests from client to
> > > > > server are allowed).
> > > > >
> > > > > Two-way communication can also be useful in the future if we
> > > > > send some server-side generated events to clients.
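The limitation described above can be illustrated with a minimal sketch. If the server only answers after the task has finished, a client can still offer executeAsync, but only by parking the blocking round trip on another thread. The class `BlockingComputeSketch` and the `roundTrip` function are hypothetical stand-ins for a real client and its blocking request/response exchange; nothing here is an existing Ignite API.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.ForkJoinPool;
import java.util.function.BiFunction;

/** Hypothetical sketch of the execute/executeAsync pair over a blocking exchange. */
final class BlockingComputeSketch {
    /** Stand-in for a blocking request/response round trip: (taskName, arg) -> result. */
    private final BiFunction<String, Object, Object> roundTrip;

    /** Executor whose threads are occupied while a task is running on the server. */
    private final Executor executor;

    BlockingComputeSketch(BiFunction<String, Object, Object> roundTrip, Executor executor) {
        this.roundTrip = roundTrip;
        this.executor = executor;
    }

    /** Blocks the calling thread until the result arrives. */
    Object execute(String task, Object arg) {
        return roundTrip.apply(task, arg);
    }

    /** Async only on the client side: the blocking call is moved to another thread. */
    CompletableFuture<Object> executeAsync(String task, Object arg) {
        return CompletableFuture.supplyAsync(() -> execute(task, arg), executor);
    }

    public static void main(String[] args) {
        // The lambda fakes the server; a real client would perform network I/O here.
        BlockingComputeSketch compute = new BlockingComputeSketch(
            (task, arg) -> task + " finished with arg " + arg,
            ForkJoinPool.commonPool());

        System.out.println(compute.executeAsync("MyTask", 42).join());
    }
}
```

In this arrangement a long-running task keeps a client thread (and, in a real client, probably the request slot on the connection) busy for its whole duration, which is the motivation for either changing the server-side processing or adding two-way communication.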
> > > > >
> > > > > In case of two-way communication, new operations can be introduced:
> > > > >
> > > > > OP_COMPUTE_EXECUTE_TASK (from client to server)
> > > > > Request: UUID nodeId, String taskName, Object arg
> > > > > Response: long taskId
> > > > >
> > > > > OP_COMPUTE_TASK_FINISHED (from server to client)
> > > > > Request: taskId, Object result
> > > > > Response: empty
> > > > >
> > > > > The same for affinity requests.
> > > > >
> > > > > Also, we can implement not only the execute task operation but some
> > > > > other operations from IgniteCompute (broadcast, run, call), but
> > > > > that will be useful only for the java thin client. And even with the
> > > > > java thin client we should either implement peer-class-loading for
> > > > > thin clients (this also requires two-way client-server
> > > > > communication) or put classes with the executed closures on the
> > > > > server locally.
> > > > >
> > > > > What do you think about the proposed protocol changes?
> > > > > Do we need two-way requests between client and server?
> > > > > Do we need support for compute methods other than "execute task"?
> > > > > What do you think about peer-class-loading for thin clients?
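The two-way variant quoted above (a response carrying a taskId, followed later by a server-to-client "task finished" message) could be tracked on the client roughly as follows. This is only a sketch: the class `PendingTasks` and its method names are hypothetical, and it assumes the response with the taskId is handled before the matching notification, as would be the case on a single ordered connection.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical client-side registry of tasks awaiting a "finished" notification. */
final class PendingTasks {
    private final ConcurrentMap<Long, CompletableFuture<Object>> pending =
        new ConcurrentHashMap<>();

    /** Called when the server has accepted a task and returned its taskId. */
    CompletableFuture<Object> onTaskAccepted(long taskId) {
        return pending.computeIfAbsent(taskId, id -> new CompletableFuture<>());
    }

    /**
     * Called when the server-to-client notification arrives.
     * Returns false if the taskId is unknown or was already completed.
     */
    boolean onTaskFinished(long taskId, Object result) {
        CompletableFuture<Object> fut = pending.remove(taskId);
        return fut != null && fut.complete(result);
    }

    public static void main(String[] args) {
        PendingTasks tasks = new PendingTasks();

        // Simulate: the server accepted the task and assigned id 1.
        CompletableFuture<Object> fut = tasks.onTaskAccepted(1L);

        // Simulate: the server later notifies the client that the task is done.
        tasks.onTaskFinished(1L, "done");

        System.out.println(fut.join());
    }
}
```

With this shape the connection stays free for other requests while the task runs, and a cancel operation would only need to send the taskId and drop the entry from the map.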
> > > >
> > > > --
> > > > Sergey Kozlov
> > > > GridGain Systems
> > > > www.gridgain.com
> >
> > --
> > Sergey Kozlov
> > GridGain Systems
> > www.gridgain.com
>
> --
> Alex.