Hello, 

I am new to Apache Ignite and come from a data warehousing background, so
please pardon me if I relate to Ignite through DBMS jargon. I have already
gone through some of the posts on the forum, but I am still unclear about
some of the basics.

1.) CacheMode=PARTITIONED 
     - When I just declare a cache as partitioned, I understand that the data
       is distributed evenly across all nodes. Is there an option to provide
       a "partition key" on which the distribution across the nodes would be
       based (in which case the skew of the distribution would depend on the
       choice of partition key)?

     - Without an Affinity Key, when I load data (using loadCache()) into a
       partitioned cache, will all source rows be sent to all nodes in my
       cluster?

2.) Affinity 
     I understand that the concept of affinity is to use a key that
     determines the node on which the data resides.

     - When I partition the cache and define an Affinity Key, is the data
       partitioned based on the Affinity Key itself? If not, how does
       affinity differ from partitioning?

     - With an Affinity Key defined, when I load data (using loadCache())
       into a partitioned cache, will the source rows be sent only to the
       node they belong to, or to all the nodes in my cluster?
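
To make question 2 concrete, here is a minimal sketch of my current mental
model, which may well be wrong. It does not use any Ignite API: the class
name, the partition count of 1024, the sample CustomerID value, and the
modulo-based partition-to-node step are all assumptions I made up for
illustration.

```java
import java.util.Objects;

public class AffinitySketch {
    // Hypothetical partition count, chosen only for illustration.
    static final int PARTITIONS = 1024;

    // Assumed model: the partition depends only on the hash of the affinity key.
    static int partitionFor(Object affinityKey) {
        return Math.floorMod(Objects.hashCode(affinityKey), PARTITIONS);
    }

    // Assumed model: each partition has one owning node.
    // The real partition-to-node assignment is probably more elaborate than modulo.
    static int nodeFor(int partition, int nodeCount) {
        return partition % nodeCount;
    }

    public static void main(String[] args) {
        int nodes = 4;
        String customerId = "CUST-42"; // hypothetical CustomerID value

        // A Staging row and a Target row that share a CustomerID would,
        // under this model, map to the same partition and hence the same node.
        int stagingNode = nodeFor(partitionFor(customerId), nodes);
        int targetNode = nodeFor(partitionFor(customerId), nodes);

        System.out.println("staging node = " + stagingNode
                + ", target node = " + targetNode);
    }
}
```

If this model is right, then "partitioning" and "affinity" would be the same
mechanism, with the Affinity Key simply replacing the cache key as the input
to the hash. Please correct me if that is not how it works.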

3.) When I create an index for a cache, does it distribute the data
     automatically (without defining any Affinity Key or partition key)?

SCENARIO DESCRIPTION 
----------------------------- 
I want to load data from a persistent layer into a Staging Cache (assume
~2B records) using loadCache(). The cache resides on a 4-node cluster.
a.) Can I load the data in such a way that each node has to process only
     ~0.5B records? Is that done by using PARTITIONED cache mode and
     defining an Affinity Key?
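
The split I am hoping for in (a) can be illustrated with a scaled-down
simulation. This is only a sketch under my own assumptions: it uses plain
Java rather than Ignite, sequential numeric CustomerIDs, a made-up partition
count of 1024, and a simple modulo mapping from partition to node.

```java
public class LoadSplitSketch {
    // Hypothetical partition count, chosen only for illustration.
    static final int PARTITIONS = 1024;

    // Counts how many records each node would own if records were routed
    // by hash(key) -> partition -> node (an assumed model, not Ignite's code).
    static long[] countPerNode(long records, int nodeCount) {
        long[] counts = new long[nodeCount];
        for (long id = 0; id < records; id++) {
            int partition = Math.floorMod(Long.hashCode(id), PARTITIONS);
            counts[partition % nodeCount]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        // 1 million records stands in for the ~2B in my real scenario.
        long[] counts = countPerNode(1_000_000L, 4);
        for (int n = 0; n < counts.length; n++) {
            System.out.println("node " + n + ": " + counts[n] + " records");
        }
    }
}
```

What I would like to know is whether loadCache() gives me this kind of split
at load time, so that each node only reads and processes its own share of the
source rows.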

Then I want to read transactions from the Staging Cache in TRANSACTIONAL
atomicity mode, look them up in a Target Cache, and do some operations.
b.) When I do the lookup on the Target Cache, how can I ensure that it
     happens only on the node where the data resides, rather than on all
     the nodes on which the Target Cache resides? Would that be done using
     the Affinity Key? If yes, how?

c.) Let's say I wanted to do a lookup on a column other than the Affinity
     Key column. Would creating an index on the lookup column help? Would I
     end up scanning all nodes in that case?

Staging Cache 
CustomerID 
CustomerEmail 
CustomerPhone       

Target Cache 
Seq_Num 
CustomerID 
CustomerEmail 
CustomerPhone 
StartDate 
EndDate 



