Hi,

I use userId as the message key.
Most users have a moderate amount of data, but some have more and a few 
have a huge amount.

I have been thinking about the following aspects of partitioning:

  1.  If two or more large users fall into the same partition, I might end up 
with one or more oversized partitions (unbalanced relative to the others)
  2.  If smaller users fall into the same partition as a huge user, the small 
users might be processed more slowly because of the huge user's data volume
  3.  If the order of the messages is not critical, I might want to allow 
several consumers to work on the data of the same huge user, so I would 
like to spread one userId across several partitions
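
To make point 3 concrete, here is a rough sketch of the kind of thing I have in 
mind. It is plain Python and not tied to any client library; the partition 
count, the fan-out table and all function names are made up for illustration, 
and how to detect huge users is left open.

```python
import hashlib
import random

NUM_PARTITIONS = 12                    # assumption: total partitions in the topic
HEAVY_USER_FANOUT = {"user-huge": 4}   # assumption: userId -> partitions to spread over


def stable_hash(key: str) -> int:
    """Hash that is stable across processes (unlike Python's built-in hash())."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def partitions_for(user_id: str) -> list:
    """All partitions a user may be written to.

    Ordinary users get exactly one partition. Users listed in
    HEAVY_USER_FANOUT get `fanout` consecutive partitions starting at
    their base partition, so the partitions are guaranteed distinct.
    """
    fanout = min(HEAVY_USER_FANOUT.get(user_id, 1), NUM_PARTITIONS)
    base = stable_hash(user_id) % NUM_PARTITIONS
    return [(base + i) % NUM_PARTITIONS for i in range(fanout)]


def choose_partition(user_id: str) -> int:
    """Pick the partition for one message of this user."""
    candidates = partitions_for(user_id)
    # A random pick spreads a huge user's messages over its partitions;
    # per-user ordering is lost for those users.
    return random.choice(candidates)
```

Ordinary users keep a single partition, so their ordering is preserved; only 
the listed huge users give up ordering. This does not solve point 1 or 2 by 
itself, since a huge user's partitions are still shared with small users.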

I have some ideas about how to partition to solve these issues, but if you 
have something that worked well for you in production I would love to hear 
about it.
Also, any links to relevant blog posts etc. would be welcome.

Thanks,
Victoria
