Hi Ashish,

I think (purely from the perspective of avoiding an unnecessary String operation) a complex key object would be better.
But your way could work as well. I feel you might well hit a performance wall with the string manipulation. Also, for every get you will hit the resolver to determine which bucket to route to, so once again, more performance (and memory) problems.

--Udo

On Jul 16, 2020, 12:01 PM -0700, aashish choudhary wrote:

Thanks, Udo, for your inputs. I am planning to do something like this in the PartitionResolver implementation:

    import org.apache.geode.cache.EntryOperation;
    import org.apache.geode.cache.PartitionResolver;

    public class CustomPartitionResolver implements PartitionResolver<String, Object> {
        @Override
        public Object getRoutingObject(EntryOperation<String, Object> opDetails) {
            // Route on the part of the key before the first underscore.
            String key = opDetails.getKey();
            return key.split("_")[0];
        }
    }

On Thu, 16 Jul 2020 at 10:56 PM, Udo Kohlmeyer wrote:

Hi there Ashish,

I think it is safe to assume that once you change the PartitionResolver strategy you might have to reload the data. I will not commit to a definitive “Yes, you have to reload the data and cannot load it again from disk” answer, but I think that answer will become self-evident when you change the region configuration, as some settings on the region cannot be amended after creation.

I don’t know if you have considered this yet, but it sounds like you have some “complex” string key that you try to parse for the common part.
Have you considered maybe using an object like this:

    import java.io.DataInput;
    import java.io.DataOutput;
    import java.io.IOException;

    import org.apache.geode.DataSerializable;

    public class ComplexKey implements DataSerializable {
        private String commonPartitioningKey;
        private String key;

        // No-arg constructor, needed for deserialization.
        public ComplexKey() {}

        public ComplexKey(String commonPartitioningKey, String key) {
            this.commonPartitioningKey = commonPartitioningKey;
            this.key = key;
        }

        // Identity is based on the natural key only.
        @Override
        public int hashCode() {
            return key.hashCode();
        }

        @Override
        public boolean equals(Object obj) {
            if (this == obj) {
                return true;
            }
            if (!(obj instanceof ComplexKey)) {
                return false;
            }
            return this.key.equals(((ComplexKey) obj).key);
        }

        public String getCommonPartitioningKey() {
            return commonPartitioningKey;
        }

        public void setCommonPartitioningKey(String commonPartitioningKey) {
            this.commonPartitioningKey = commonPartitioningKey;
        }

        public String getKey() {
            return key;
        }

        public void setKey(String key) {
            this.key = key;
        }

        @Override
        public void toData(DataOutput out) throws IOException {
            out.writeUTF(commonPartitioningKey);
            out.writeUTF(key);
        }

        @Override
        public void fromData(DataInput in) throws IOException, ClassNotFoundException {
            commonPartitioningKey = in.readUTF();
            key = in.readUTF();
        }
    }

With this you can still do a get using the natural key of the object, but the PartitionResolver can partition according to the commonPartitioningKey. Imo, it just cleanly separates the partitioning and natural key logic.

BE AWARE: you should not use PDX serialization for keys, so stick to Serializable or DataSerializable.

As for functions, you should see no difference. Colocation just means that the same bucket number of colocated regions is stored on the same server. What you can now use is the notion of “local” data across colocated regions, so you don’t need to go across the network to access colocated data. So functions can possibly run using local data only, even when they need data from another region. It might improve performance a little.

Anyway, lots of information. Reach out if you get stuck or don’t understand something.
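To make the colocation point concrete, here is a minimal, Geode-free sketch in plain Java. It assumes a bucket is picked from the hash of the routing object modulo a fixed bucket count. The class names `RoutingKey` and `ColocationSketch`, the bucket count of 113, and the sample keys are hypothetical stand-ins for illustration, not Geode APIs. In a real deployment the `routingObject` logic would live in `PartitionResolver.getRoutingObject`.

```java
import java.util.Objects;

// Hypothetical stand-in for ComplexKey, without Geode serialization.
final class RoutingKey {
    final String commonPartitioningKey;
    final String key;

    RoutingKey(String commonPartitioningKey, String key) {
        this.commonPartitioningKey = Objects.requireNonNull(commonPartitioningKey);
        this.key = Objects.requireNonNull(key);
    }

    // Equality and hashing use only the natural key, as in ComplexKey.
    @Override
    public boolean equals(Object obj) {
        return obj instanceof RoutingKey && key.equals(((RoutingKey) obj).key);
    }

    @Override
    public int hashCode() {
        return key.hashCode();
    }
}

final class ColocationSketch {
    // Assumed bucket count, for illustration only.
    static final int TOTAL_BUCKETS = 113;

    // What a resolver would return: the shared part, with no string parsing.
    static Object routingObject(RoutingKey k) {
        return k.commonPartitioningKey;
    }

    // Assumption: bucket = non-negative hash of the routing object mod bucket count.
    static int bucketFor(RoutingKey k) {
        return Math.floorMod(routingObject(k).hashCode(), TOTAL_BUCKETS);
    }

    public static void main(String[] args) {
        // Entries from two different regions that share a partitioning key.
        RoutingKey order = new RoutingKey("cust42", "cust42_order_7");
        RoutingKey invoice = new RoutingKey("cust42", "cust42_invoice_3");
        System.out.println(bucketFor(order) == bucketFor(invoice)); // prints true
    }
}
```

Because both keys return the same routing object, they land in the same bucket number, which is what lets a function read the related entry locally when the regions are colocated.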
--Udo

On Jul 16, 2020, 9:38 AM -0700, aashish choudhary wrote:

Hi,

We are seeing a performance issue with partitioned regions: when we execute a data-aware function, some of the calls to other regions inside the function go to different nodes for further processing. So we are trying to implement data colocation between those regions. We will be using custom partitioning of the data by implementing the PartitionResolver interface.

Questions:

1. I believe we would need to export and re-import the data after creating the regions with colocation. Please confirm.
2. Our regions have different keys, but all regions have the first part of the key in common (separated by _), so in the class implementing the partition resolver we just take the first part of the key for routing. Will this custom-partition the data correctly?
3. Do we need to make any changes when reading data in functions after enabling data colocation?

With best regards,
Ashish
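On the question of whether routing on the first part of the key partitions correctly, the prefix extraction itself can be checked in isolation. The sketch below is plain Java with a hypothetical helper, `routingPrefix`, and made-up sample keys. It uses `indexOf` and `substring` rather than `split`, which avoids building an array on every call, and it treats a key without an underscore as its own prefix.

```java
final class PrefixRouting {
    // Hypothetical helper: returns the part of the key before the first '_',
    // or the whole key when there is no underscore.
    static String routingPrefix(String key) {
        int idx = key.indexOf('_');
        return idx < 0 ? key : key.substring(0, idx);
    }

    public static void main(String[] args) {
        // Keys from two different regions that share the first key part.
        System.out.println(routingPrefix("cust42_order_7"));   // prints cust42
        System.out.println(routingPrefix("cust42_invoice_3")); // prints cust42
        // A key with no separator routes on itself.
        System.out.println(routingPrefix("cust42"));           // prints cust42
    }
}
```

Keys that share a prefix produce equal routing objects, so they would map to the same bucket number in each colocated region. Prefixes that differ only by case or stray whitespace are different strings and would not.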
