Hi,
The reduce is done on the node to which the JDBC or thin client is connected; it could
be either a client or a server node.
Thanks!
-Dmitry
--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/
Hi Dmitry,
Another question: if I query via JDBC or the new thin client, will the reduce happen on one of the server nodes serving as a proxy?
Sent: Thursday, June 28, 2018 at 2:02 PM
From: "Tom M"
To: user@ignite.apache.org
Subject: Re: Scaling with SQL query
Hi Dmitry,
Thanks for the great explanation!
It looks like the "reduce" happening on the client is the issue, and that it can be solved by adding clients.
Sent: Wednesday, June 27, 2018 at 6:22 PM
From: dkarachentsev
To: user@ignite.apache.org
Subject: Re: Scaling with SQL query
Hi Jose,
1. Yep, I would say you'll gain more with persistence, because if you
split the data between real machines, each can keep more hot data in memory and each
has a separate hard drive. The more data you can fit into RAM and the more hard
drives can work in parallel, the better the performance you get.
2
Hi Dmitry,
This is a fantastic explanation to better understand scaling strategies for
SQL - Thanks.
A couple of questions:
1. Do these mechanisms apply equally for persistent caches?
2. Regarding your point (2.) - How would one achieve this? (more clients?)
(more connections to node?) Are these
Hi,
Slight degradation is expected in some cases. Let me explain how it works.
1) The client sends a request to each node (if you have query parallelism > 1, then
the number of requests is multiplied by that number).
2) Each node runs that query against its local dataset.
3) Each node responds with 100 entries.
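The steps above can be sketched as a small simulation. This is not Ignite code and uses no Ignite API; it is a plain-Python model that assumes each row lives on exactly one node and that the query is "ORDER BY time DESC LIMIT 100". The names map_step, reduce_step and run_query are made up for illustration. It shows why the reducer receives more rows as nodes are added, even though the final result stays at 100 rows.

```python
import heapq
from itertools import islice

LIMIT = 100  # matches "LIMIT 100" in the query discussed in this thread


def map_step(local_times, limit=LIMIT):
    """Map: one node returns its newest `limit` rows, sorted newest first."""
    return heapq.nlargest(limit, local_times)


def reduce_step(partials, limit=LIMIT):
    """Reduce: merge the pre-sorted partial results and keep the newest `limit`."""
    merged = heapq.merge(*partials, reverse=True)
    return list(islice(merged, limit))


def run_query(nodes, limit=LIMIT):
    """Return (final rows, number of rows the reducer had to receive)."""
    partials = [map_step(rows, limit) for rows in nodes]
    received = sum(len(p) for p in partials)
    return reduce_step(partials, limit), received


if __name__ == "__main__":
    for node_count in (2, 10):
        # Hypothetical data: node i holds timestamps i, i + node_count, ...
        nodes = [list(range(i, 10_000, node_count)) for i in range(node_count)]
        rows, received = run_query(nodes)
        print(node_count, "nodes:", received, "rows received,", len(rows), "returned")
```

In this model the reducer receives (number of nodes) x 100 rows and discards all but 100 of them, so the work on the reducing node grows with cluster size.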
s shouldn't introduce much overhead. How will this query be processed? By sending requests to each node, looking up the data in each node's index, and returning the data to the "reducer" node? Is any inter-node data exchange involved?
Sent: Wednesday, June 27, 2018 at 1:36 PM
From: &q
Hi Tom,
In the case of a replicated cache, Ignite plans the execution of the SQL
query across the whole cluster by splitting it into multiple map queries and a
single reduce query.
Thus there may be communication overhead, because the "reduce"
node collects data from multiple nodes.
Please show
Hi,
I have a cluster of 10 nodes, and a cache with replication factor 3 and persistence not enabled.
The SQL query is pretty simple -- "SELECT * FROM Logs ORDER BY time DESC LIMIT 100".
I have checked that the index on the "time" attribute is applied.
When I increase the number of nodes, throughpu