Hi there Ashish,
I think it is safe to assume that once you change the PartitionResolver
strategy, you might have to reload the data.
I will not commit to a definitive “Yes, you have to reload the data and cannot
load it again from disk” answer, but I think the answer will become
self-evident when you change the region configuration, as some settings on a
region cannot be amended after creation.
I don’t know if you have considered this yet, but it sounds like you have some
“complex” string key that you parse for the common part. Have you considered
maybe using an object like:
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

import org.apache.geode.DataSerializable;

public class ComplexKey implements DataSerializable {

    private String commonPartitioningKey;
    private String key;

    // No-arg constructor is required by DataSerializable.
    public ComplexKey() {}

    public ComplexKey(String commonPartitioningKey, String key) {
        this.commonPartitioningKey = commonPartitioningKey;
        this.key = key;
    }

    // hashCode/equals use only the natural key, so gets behave as before.
    @Override
    public int hashCode() {
        return key.hashCode();
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) return true;
        if (!(obj instanceof ComplexKey)) return false;
        return this.key.equals(((ComplexKey) obj).key);
    }

    public String getCommonPartitioningKey() {
        return commonPartitioningKey;
    }

    public void setCommonPartitioningKey(String commonPartitioningKey) {
        this.commonPartitioningKey = commonPartitioningKey;
    }

    public String getKey() {
        return key;
    }

    public void setKey(String key) {
        this.key = key;
    }

    // writeUTF/readUTF require non-null values, so both parts of the key
    // must be set before the key is serialized.
    @Override
    public void toData(DataOutput out) throws IOException {
        out.writeUTF(commonPartitioningKey);
        out.writeUTF(key);
    }

    @Override
    public void fromData(DataInput in) throws IOException, ClassNotFoundException {
        commonPartitioningKey = in.readUTF();
        key = in.readUTF();
    }
}
This way you can still do a get using the natural key of the object, but the
PartitionResolver can partition according to the commonPartitioningKey. Imo, it
just cleanly separates the partitioning logic from the natural-key logic.
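To show how the two fit together, here is a minimal PartitionResolver sketch
that routes on the commonPartitioningKey field of the key class above. The
class name is my own invention; it only runs inside a Geode member, registered
on the region via cache.xml or the region factory.

```java
import org.apache.geode.cache.EntryOperation;
import org.apache.geode.cache.PartitionResolver;

// Hypothetical resolver: routes each entry by the common part of its key.
// Entries that share a commonPartitioningKey hash to the same bucket, and
// for colocated regions that bucket lives on the same member.
public class CommonKeyResolver implements PartitionResolver<ComplexKey, Object> {

    @Override
    public Object getRoutingObject(EntryOperation<ComplexKey, Object> opDetails) {
        // No string parsing needed; the key object carries the routing part.
        return opDetails.getKey().getCommonPartitioningKey();
    }

    @Override
    public String getName() {
        return getClass().getSimpleName();
    }

    @Override
    public void close() {
        // no resources to release
    }
}
```

The same resolver class (or at least the same routing logic) should be
configured on every region you want colocated, so that equal routing objects
map to equal buckets across all of them.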
BE AWARE: you should not use PDX serialization for keys, so stick to
Serializable or DataSerializable.
As for functions: you should see no difference. Colocation just means that the
same bucket number of colocated regions is stored on the same server. What you
can now use is the notion of “local” data across colocated regions, so you
don’t need to go across the network to access colocated data. Functions can
then run using local data only and don’t need to go across the network when
they need data from another region. It might improve performance a little.
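As an illustration, a data-aware function can read colocated data through the
local-data views that Geode exposes. This is only a sketch (the class name and
the aggregation step are made up), and like any function it has to be deployed
and executed on the cluster:

```java
import java.util.Map;

import org.apache.geode.cache.Region;
import org.apache.geode.cache.execute.Function;
import org.apache.geode.cache.execute.FunctionContext;
import org.apache.geode.cache.execute.RegionFunctionContext;
import org.apache.geode.cache.partition.PartitionRegionHelper;

// Hypothetical data-aware function that stays on local buckets.
public class LocalLookupFunction implements Function<Object> {

    @Override
    public void execute(FunctionContext<Object> context) {
        RegionFunctionContext rfc = (RegionFunctionContext) context;

        // Local view of the region the function was executed on.
        Region<Object, Object> localData =
            PartitionRegionHelper.getLocalDataForContext(rfc);

        // Local views of the colocated regions: same buckets, same member,
        // so these reads never leave this server.
        Map<String, Region<?, ?>> colocated =
            PartitionRegionHelper.getLocalColocatedRegions(rfc);

        // ... look up related entries in the colocated views and aggregate ...
        context.getResultSender().lastResult(localData.size());
    }

    @Override
    public String getId() {
        return "LocalLookupFunction"; // hypothetical id
    }
}
```

The key point is to read the colocated regions through
getLocalColocatedRegions rather than through a plain region reference;
otherwise the gets can still be routed to other members.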
Anyway, lots of information. Reach out if you get stuck or don’t understand
something.
—Udo
On Jul 16, 2020, 9:38 AM -0700, aashish choudhary
<[email protected]>, wrote:
Hi,
We are seeing some performance issues with partitioned regions: when we
execute a data-aware function, some of the calls to other regions inside the
function go to different nodes for further processing. So we are trying to
implement data colocation between those regions.
We will be using custom partitioning of data by implementing the
PartitionResolver interface.
Questions
I believe we would need to import/export the data again after creating regions
with colocation. Please confirm.
We have regions with different keys, but all regions have the first part of
the key in common (separated by _), so in the class implementing
PartitionResolver we just take the first part of the key for routing. Will
this custom-partition the data correctly?
Do we need to make any changes to how we read data in functions after enabling
data colocation?
With best regards,
Ashish