Re: Question on Key choice for an Ignite cache

2020-06-30 Thread Eugene McGowan
On Tue 30 Jun 2020, 13:51, Ilya Kasnacheev wrote:

> Hello!
>
> You can also use Binary Object (POJO) as a key, i.e., you can use an
> Object with String email and String phone fields.
>
> I don't recommend mangling strings further than concatenation. String key
> is no worse than numeric key.
>
> Regards,
> --
> Ilya Kasnacheev
>
>
> Tue, 30 Jun 2020 at 15:17, Eugene McGowan :
>
>>
>> We would like to create an Ignite key by concatenating data. This is a
>> standard distributed-system pattern for key-value stores, and would allow the
>> reader and writer to consistently access the cache. The data is a combination
>> of strings and integers. To simplify our use case, let's say it's an email
>> address (f...@bar.com) and a phone number (123444) we want to use to build
>> our key. Our key could therefore be: foo@bar.com_123444
>> The advantage of this approach is that the key can easily
>> be read/debugged. Is there a more optimal format for Ignite, though? For
>> regular RDBMSs, integers seem to have been the default choice. We could
>> convert f...@bar.com to an int, e.g. f converts to 6, o to 15, etc.
>> This naïve first attempt at conversion would of course lead to clashes,
>> as 111 could map to either aaa or ak. This could potentially be worked
>> around, so we are looking for an initial steer on what Ignite would prefer
>> as a key (i.e. strings or ints). Would hashing be recommended? In terms of
>> any partitioning-type logic, we are guessing not; this is more about
>> creating deterministic, unique keys.
>>
>
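A POJO key along the lines Ilya describes could look like this minimal sketch (class and field names are illustrative, not from the thread):

```java
import java.util.Objects;

// Sketch of a composite POJO key with an email and a phone field. Ignite
// stores such keys in binary form and compares them field by field, so
// equals/hashCode should be consistent with the fields used.
public class ContactKey {
    private final String email;
    private final String phone;

    public ContactKey(String email, String phone) {
        this.email = email;
        this.phone = phone;
    }

    @Override public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof ContactKey)) return false;
        ContactKey k = (ContactKey) o;
        return Objects.equals(email, k.email) && Objects.equals(phone, k.phone);
    }

    @Override public int hashCode() {
        return Objects.hash(email, phone);
    }
}
```

With such a class, the writer and reader build the same key from the same two fields, and no string mangling or manual concatenation is needed.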


Question on Key choice for an Ignite cache

2020-06-30 Thread Eugene McGowan
We would like to create an Ignite key by concatenating data. This is a
standard distributed-system pattern for key-value stores, and would allow the
reader and writer to consistently access the cache. The data is a combination
of strings and integers. To simplify our use case, let's say it's an email
address (f...@bar.com) and a phone number (123444) we want to use to build our
key. Our key could therefore be: foo@bar.com_123444
The advantage of this approach is that the key can easily be read/debugged.
Is there a more optimal format for Ignite, though? For regular RDBMSs,
integers seem to have been the default choice. We could convert f...@bar.com
to an int, e.g. f converts to 6, o to 15, etc.
This naïve first attempt at conversion would of course lead to clashes, as
111 could map to either aaa or ak. This could potentially be worked around,
so we are looking for an initial steer on what Ignite would prefer as a key
(i.e. strings or ints). Would hashing be recommended? In terms of any
partitioning-type logic, we are guessing not; this is more about creating
deterministic, unique keys.
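The clash the question anticipates is easy to demonstrate: mapping each letter to its 1-based alphabet position and concatenating produces the same encoding for aaa and ak. A sketch of that naive scheme (shown only to illustrate the collision, not as a recommendation):

```java
public class NaiveKey {
    // Map each lowercase letter to its 1-based alphabet position and
    // concatenate the digits: a -> 1, f -> 6, o -> 15, k -> 11, etc.
    // "aaa" and "ak" both encode to "111", so the mapping is not injective.
    public static String encode(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            sb.append(c - 'a' + 1);
        }
        return sb.toString();
    }
}
```

A plain concatenated string such as foo@bar.com_123444 avoids the problem entirely, since distinct inputs stay distinct.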


Re: Ignite concepts

2020-06-12 Thread Eugene McGowan
Hi Denis,

Again, thanks for your feedback and links; these are very helpful.

Also, it looks like prior to 2.8.0 the Java thin client was thread safe but
single threaded, whereas since 2.8.0 the Java thin client is multithreaded (I
have a book that mentioned it was not multithreaded, which confused me).
Is this correct, and is it why you suggest I only need one connection per
service?

As a rule of thumb, is it more performant to use gets and puts as opposed to
SQL, and to only use the SQL API if you need to do something that requires
SQL?

Regarding the 3-different-caches problem, I did a bit of research on this
and think I am good: essentially, by not using try-with-resources with the
IgniteClient, I can control when the IgniteClient object is closed and
therefore use it for all 3 cache checks.
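That lifecycle can be sketched as follows; the nested Client class below is a stand-in for the real thin client (which would need a running cluster to connect to), so only the open-once/close-explicitly pattern is shown:

```java
// Sketch: control the client's lifetime explicitly instead of scoping it
// to a try-with-resources block per request.
public class ClientHolder {
    // Stand-in for an AutoCloseable thin client (e.g. IgniteClient); the
    // real one would be created once with the thin-client API at startup.
    static class Client implements AutoCloseable {
        boolean closed = false;
        @Override public void close() { closed = true; }
    }

    private static final Client CLIENT = new Client(); // opened once

    // Every cache check reuses the same open client.
    public static Client get() { return CLIENT; }

    // Closed explicitly at application shutdown, not after each lookup.
    public static void shutdown() { CLIENT.close(); }
}
```

The key point is that close() is called once at shutdown, so all three cache checks (and all requests) share one live connection.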

Regards Eugene

On Thu, Jun 11, 2020 at 10:57 PM Denis Magda  wrote:

> Eugene,
>
> In such a case, I would go for the thin client for several reasons:
>
>- With the thick client (aka. standard client), you will come across
>several limitations if the servers are deployed on VMs, but the thick
>clients are running in a K8S environment. Those limitations are to be
>addressed soon [1].
>- With the partition-awareness feature [2], the thin client will be
>able to send requests to those server nodes that store the primary copy of
>a requested record. The functionality should be released in Ignite 2.9 and
>already exists in GridGain.
>
> I don't think that you need to be bothered about a connection pool. Just
> have a single thin client connection per service instead and enable the
> partition-awareness. If you don't use the latter feature, then ensure that
> the thin clients on those service instances are connected to random proxy
> servers (you need to pass different IPs of the server nodes in the
> connection string parameter [3]).
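For reference, passing several server addresses as the linked configuration docs describe might look like this fragment (host names are illustrative; 10800 is the default thin-client port):

```java
// Fragment only: list several server addresses so thin clients spread
// their connections across the cluster's nodes.
ClientConfiguration cfg = new ClientConfiguration()
    .setAddresses("server1:10800", "server2:10800", "server3:10800");
try (IgniteClient client = Ignition.startClient(cfg)) {
    // ... key-value operations ...
}
```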
>
>  One thing I am not clear on is, in the scenario where we need to check 3
>> different caches for a single call to our service, is this going to be an
>> expensive operation?
>>  Will we have to initiate a connection to Ignite for every call to the
>> cache?
>
>
> To help with this, I need to understand your business use case better.
> There is a chance you can do all the checks with one network hop.
>
> [1]
> http://apache-ignite-developers.2346864.n4.nabble.com/DISCUSSION-New-Ignite-settings-for-IGNITE-12438-and-IGNITE-13013-td47586.html
> [2]
> https://www.gridgain.com/docs/latest/developers-guide/thin-clients/getting-started-with-thin-clients#partition-awareness
> [3]
> https://apacheignite.readme.io/docs/java-thin-client-initialization-and-configuration#initialization
>
> -
> Denis
>
>
> On Thu, Jun 11, 2020 at 12:12 AM Eugene McGowan 
> wrote:
>
>> Thanks for getting back to me on this, Denis.
>> The company's Ignite cluster is already up and running on VMs on our
>> company network. The service my team is standing up will run in our private
>> cloud K8S environment.
>>
>> Regarding our use case, we will have say 2 or 3 caches. Our service will be
>> passed 3 parameters; if it gets a match for parameter 1 in cache 1, no
>> further action is required. If not, it needs to check cache 2 for parameter
>> 2; again, if it gets a match it uses data from cache 2 and no further
>> action is required; otherwise it needs to check cache 3. So cache 1 has the
>> most relevant data.
>> We would have somewhere between 2 and 10 instances of our service running,
>> servicing a max of 3000+ requests per second.
>>
>> Our preference, if possible, would be to use a thin client. One thing I am
>> not clear on is, in the scenario where we need to check 3 different caches
>> for a single call to our service, is this going to be an expensive
>> operation?
>>  Will we have to initiate a connection to Ignite for every call to the
>> cache? From reading, connection pooling does not come out of the box with
>> key-value operations over the thin client, but it's possible to implement
>> your own connection pooling. Is it therefore possible to have long-lived
>> connections to avoid the expense of constantly connecting and disconnecting
>> from Ignite caches?
>>
>> Regarding the possibility of using the SQL API, the background is:
>>   (1) We have a rarely used use case where we want to get a
>> combination of data from all 3 caches, so was thinking the SQL API would
>> be one way of doing that.
>>   (2) Looking at the docs it seems that connection pooling is
>> possible via JDBC over the thin client, so was wondering would that be a
>> more performant approach, if it's not possible to have a connection pool
>> for key value operatio

Re: Ignite concepts

2020-06-11 Thread Eugene McGowan
Thanks for getting back to me on this, Denis.
The company's Ignite cluster is already up and running on VMs on our
company network. The service my team is standing up will run in our private
cloud K8S environment.

Regarding our use case, we will have say 2 or 3 caches. Our service will be
passed 3 parameters; if it gets a match for parameter 1 in cache 1, no
further action is required. If not, it needs to check cache 2 for parameter
2; again, if it gets a match it uses data from cache 2 and no further
action is required; otherwise it needs to check cache 3. So cache 1 has the
most relevant data.
We would have somewhere between 2 and 10 instances of our service running,
servicing a max of 3000+ requests per second.
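The three-cache cascade described above can be sketched as follows, with plain Maps standing in for the Ignite caches so the logic is self-contained:

```java
import java.util.Map;

public class CascadeLookup {
    // Cache 1 holds the most relevant data; fall through to cache 2, then
    // cache 3. Maps stand in for the Ignite caches in this sketch.
    public static String lookup(Map<String, String> cache1,
                                Map<String, String> cache2,
                                Map<String, String> cache3,
                                String p1, String p2, String p3) {
        String hit = cache1.get(p1);
        if (hit != null) return hit;   // match on parameter 1: done
        hit = cache2.get(p2);
        if (hit != null) return hit;   // match on parameter 2: use cache 2's data
        return cache3.get(p3);         // otherwise fall back to cache 3
    }
}
```

Over a single long-lived client this is at most three get calls per request, i.e. at most three round trips and no connection setup per call.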

Our preference, if possible, would be to use a thin client. One thing I am
not clear on is, in the scenario where we need to check 3 different caches
for a single call to our service, is this going to be an expensive
operation?
 Will we have to initiate a connection to Ignite for every call to the
cache? From reading, connection pooling does not come out of the box with
key-value operations over the thin client, but it's possible to implement
your own connection pooling. Is it therefore possible to have long-lived
connections to avoid the expense of constantly connecting and disconnecting
from Ignite caches?

Regarding the possibility of using the SQL API, the background is:
  (1) We have a rarely used use case where we want to get a combination
of data from all 3 caches, so was thinking the SQL API would be one way of
doing that.
  (2) Looking at the docs it seems that connection pooling is possible
via JDBC over the thin client, so was wondering would that be a more
performant approach, if it's not possible to have a connection pool for
key-value operations over the thin client.

Regards Eugene

On Thu, Jun 11, 2020 at 12:28 AM Denis Magda  wrote:

> Hi Eugene,
>
> Let me help you with that as much as I can. Please help me understand the
> following:
>
>- Is the cluster supposed to be deployed outside of Kubernetes (on VMs
>or bare-metal)? Is this a private cloud (OpenShift, PCF) or a public
>environment (Azure, AWS, etc.)?
>- If you use primary keys to access the records, do you really need
>SQL for this type of operations? Or do you perform additional filtering of
>records with SQL operands?
>
> In the meantime, please check this page to understand the pros and cons of
> thick and thin clients:
> https://www.gridgain.com/docs/latest/installation-guide/deployment-modes#thick-vs-thin-clients
>
> -
> Denis
>
>
> On Wed, Jun 10, 2020 at 1:07 PM Eugene McGowan 
> wrote:
>
>> Hi Igniters,
>> Our company has recently started using Ignite and our team is new to it.
>> We would like to use Ignite as a cache layer for an upcoming project.
>>
>> Our use case is that we have an Ignite cluster that we want to interact
>> with. Our client service is Spring Boot based and running in a Kubernetes
>> cluster, but not co-located with our Ignite cluster. We anticipate our
>> service will receive *3000+ requests per second*; the service will scale
>> based on the volume of requests, and we will need to query Ignite for each
>> of these requests.
>> The data we need to query is fairly straightforward: a single row from
>> either 1 or 2 caches based on primary key.
>> We are at the point where we are beginning to design the architecture.
>>
>> I have done some initial reading on Ignite and was wondering if someone
>> could verify my understanding is correct.
>>
>> I would also be interested to know what architectures other folks are
>> using for similar use cases.
>>
>> *===*
>> *Thin Client*
>> *===*
>>   JDBC connections are not thread safe:
>> https://apacheignite-sql.readme.io/docs/jdbc-driver
>>   To use this approach we need to ensure non-concurrent use of
>> Connections, Statements, and ResultSets, perhaps using a locking mechanism.
>>
>>   Key-value operations or SQL queries via the API over the thin client
>> are thread safe.
>>   Connection pooling is not provided via the API, but it is possible to
>> implement your own connection pool.
>>   If you implement your own connection pool you could keep connections
>> alive to reduce the cost of establishing a connection to Ignite, reusing
>> the connections for various requests.
>>
>>
>> ==
>> *Thick Client *
>> ==
>>
>>   Does not have a concept of connection pooling.
>>
>>   When using either JDBC or the Ignite API on the thick client,
>>   it is thread safe and can handle multiple concurrent threads.
>>
>>   Is part of the Ignite cluster.

Ignite concepts

2020-06-10 Thread Eugene McGowan
Hi Igniters,
Our company has recently started using Ignite and our team is new to it.
We would like to use Ignite as a cache layer for an upcoming project.

Our use case is that we have an Ignite cluster that we want to interact with.
Our client service is Spring Boot based and running in a Kubernetes cluster,
but not co-located with our Ignite cluster. We anticipate our service will
receive 3000+ requests per second; the service will scale based on the volume
of requests, and we will need to query Ignite for each of these requests.
The data we need to query is fairly straightforward: a single row from
either 1 or 2 caches based on primary key.
We are at the point where we are beginning to design the architecture.

I have done some initial reading on Ignite and was wondering if someone could
verify my understanding is correct.

I would also be interested to know what architectures other folks are using
for similar use cases.

===
Thin Client
===

  JDBC connections are not thread safe:
https://apacheignite-sql.readme.io/docs/jdbc-driver

  To use this approach we need to ensure non-concurrent use of Connections,
Statements, and ResultSets, perhaps using a locking mechanism.

  Key-value operations or SQL queries via the API over the thin client are
thread safe.
  Connection pooling is not provided via the API, but it is possible to
implement your own connection pool.
  If you implement your own connection pool you could keep connections alive
to reduce the cost of establishing a connection to Ignite, reusing
the connections for various requests.
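A hand-rolled pool of the kind described can be as small as a blocking queue of pre-opened connections; T below is whatever connection type is pooled, and this is a sketch rather than anything from the Ignite API:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Supplier;

// Minimal connection-pool sketch: connections are created once up front and
// reused, avoiding a connect/disconnect cycle per request.
public class SimplePool<T> {
    private final BlockingQueue<T> idle;

    public SimplePool(int size, Supplier<T> factory) {
        idle = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) {
            idle.add(factory.get()); // pre-open long-lived connections
        }
    }

    // Returns an idle connection, or null if the pool is exhausted
    // (a sketch; a real pool would block or grow instead).
    public T borrow() {
        return idle.poll();
    }

    // Callers must return the connection so others can reuse it.
    public void release(T conn) {
        idle.add(conn);
    }
}
```

A real pool would also need health checks and reconnection on failure, but the long-lived-connection idea is exactly this.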
  
==
Thick Client
==

  Does not have a concept of connection pooling.

  When using either JDBC or the Ignite API on the thick client,
  it is thread safe and can handle multiple concurrent threads.

  Is part of the Ignite cluster.
  Ignite servers can initiate TCP connections to it, so it needs to be
reachable via a TCP connection.
  If the client is behind a firewall, NAT, or load balancer, using the thick
client is challenging.

 

=
Question on GRPC
=

The thin client TCP socket connection uses a binary protocol:
https://apacheignite.readme.io/docs/binary-client-protocol

I may be mixing up network layers here, but is it possible to run this over a
gRPC connection?