AnandInguva commented on issue #26209:
URL: https://github.com/apache/beam/issues/26209#issuecomment-1882208113
From my experiments, I found that if you unpickle the same payload into
two different objects:
With dill, the two objects have different memory addresses.
With cloudpickle, they have the same memory address.
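A minimal sketch of how such an identity check could be written. The helper name `unpickles_to_same_object` is hypothetical, and the baseline below uses the standard-library `pickle` only so the snippet runs without extra dependencies; it does not reproduce the dill or cloudpickle results described above.

```python
import pickle


def unpickles_to_same_object(obj, dumps=pickle.dumps, loads=pickle.loads):
    """Serialize obj once, load the payload twice, and report whether both
    loads return the very same object (identity, not equality)."""
    payload = dumps(obj)
    first = loads(payload)
    second = loads(payload)
    return first is second


# Baseline with the standard library: a list is rebuilt on every load,
# so the two results are distinct objects.
print(unpickles_to_same_object([1, 2, 3]))  # False
```

To repeat the experiment with the other picklers, pass `dill.dumps, dill.loads` or `cloudpickle.dumps, cloudpickle.loads` as the `dumps` and `loads` arguments, and use the `CombineFn` instance in question as `obj`.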
In the `CombineFn` `with_fanout` case, `lift_combiners` runs during pipeline
stage creation. While inspecting the pipeline proto there, I found that the
Combine stages share the same proto payload.
For example, here are the proto definitions for Merge and ExtractOutputs:
```
INFO:root:unique_name: "Do/CombinePerKey/CombinePerKey(PreCombineFn)/Merge"
spec {
urn: "beam:transform:combine_per_key_merge_accumulators:v1"
payload: "\n\260\013\n
beam:combinefn:pickled_python:v1\032\213\013\n\210\013QlpoOTFBWSZTWSejpnAAAvl/9P//1blC8Zv3n3/f5r////Zj5x4QFAAAAQQAUAO+uzXW27ruM7WdYSSQp4JCeg1NMR4hP1AjAJtTIwBBgjCDATEwGiGmGplGI0TChkaaAANPUNGgAAAAAAAAAABKaJE01TZo1NCbUeoPUPUAaeoNAAAAAAAAAeoHlDgABoAGgDQAAGjEAA0AAAAADQASmqaAFGaap+inqeUyaB6gGmJ6Ro2iDTRo0PUZAPU9IGmQACJNfGbYf9uAQYQQGEMLHYYh+sgH7Hmi/00ptIf8Auac02UitRt1Y5CWzcaWjZfGQ2NWnED4py8y6HLCYKLNWgmENmTDOIZDSk40REEQiHcfLIy4xkZwy1GEzH819JmdYQR4vgGJYwxYj03YDcEHuERBK1SGGOpoBHaC6F5GXXOR08euPzsLNQZC5WgHFK98logwMMPuc4ffc6oJpEQ/D933+//szND08iaOCGGOOLfCNgTMj1z61A+DBTQ5oMzK5exvY7nDyiThEqZ7J21YCdY6B6dZPgiJBGYRJIuPV1p1ttE7PLXceZcrgfcMoqdKZFBTEhviKyhoG31vUcHOtE4FLZQiRjb7vkUBCfPP6rgqvNIw6xosiMYwVUQXLrdUIDJRYyOFO7PrdVsQ6ae+9NAqx15YOagHQjOErBQaeZaEwSwcYdWggYmSlgVzT5YglzAygT6tgIkCLuGzVcIyeW3YefuQ4ERQiZ7cIigzX4Te1/mw1dL2ksFzS1Z9Yxfsas01JNesykI08fMTJDeZIHFwD2qyD1stverBUM3NzqWf7kX8qOFk2k7gYRK05qnq6Uhoy7PB9tQPf8rTu+CIi/clKTu+ExFGPDiTl+3osE2fqTjJU7GPCg1JImEAC+DA2OuARIDjDhbE1pJwFJXo
JphrbAnYEQw6TpXEEVTEzDjBowlMhNE8SDHyQbKHMIII5HXZkFbgWcvRHqBwWBnElA6dFp5hwWUTa1WQViqkavqtkpmigH9UvzpJ41CC1ANMTC9VIPQD5ZgasomaEkJigzmU7Ic/BNLdWjn/cu/Gth6BaNnORhTKaeT1VvO8GGWexT0YNIlfo0LxIakyabaBUXLIiEnWzV8TFBiX7GIyGE6RulEtrTMTOZ6q/g0kTyVF1ZdqL3Ha4TvGYWC7AW5pZQQmjDk07u7teB8vrPBNQxdZNb0rVehEECPoweDHZVCiz0+wclFbSvGZvuShaZgsGfJOSvTRdduwrCJ61gxPIpTZFSi22tBDjGTMfw/yb+9OgJmny1PPv6Gaop22ZWYSdO7MQKIEhQOcWw6g7Vb6cFQBFRwycWH0EgPPBaEBMowZwfwhJEVElLOK3wVdnKQt3KR6KZxCTqlPD8Z3CxNVcqIkCStLBaKkWgOX5PEXckU4UJAno6Zw\022\037ref_Coder_FastPrimitivesCoder_7"
}
inputs {
key: "in"
value: "pcollection_1"
}
outputs {
key: "out"
value: "pcollection_2"
}
environment_id: "ref_Environment_default_environment_1"
annotations {
key: "python_type"
value: "apache_beam.transforms.core.CombinePerKey"
}
INFO:root:unique_name: "Do/CombinePerKey/CombinePerKey(PreCombineFn)/ExtractOutputs"
spec {
urn: "beam:transform:combine_per_key_extract_outputs:v1"
payload: "\n\260\013\n
beam:combinefn:pickled_python:v1\032\213\013\n\210\013QlpoOTFBWSZTWSejpnAAAvl/9P//1blC8Zv3n3/f5r////Zj5x4QFAAAAQQAUAO+uzXW27ruM7WdYSSQp4JCeg1NMR4hP1AjAJtTIwBBgjCDATEwGiGmGplGI0TChkaaAANPUNGgAAAAAAAAAABKaJE01TZo1NCbUeoPUPUAaeoNAAAAAAAAAeoHlDgABoAGgDQAAGjEAA0AAAAADQASmqaAFGaap+inqeUyaB6gGmJ6Ro2iDTRo0PUZAPU9IGmQACJNfGbYf9uAQYQQGEMLHYYh+sgH7Hmi/00ptIf8Auac02UitRt1Y5CWzcaWjZfGQ2NWnED4py8y6HLCYKLNWgmENmTDOIZDSk40REEQiHcfLIy4xkZwy1GEzH819JmdYQR4vgGJYwxYj03YDcEHuERBK1SGGOpoBHaC6F5GXXOR08euPzsLNQZC5WgHFK98logwMMPuc4ffc6oJpEQ/D933+//szND08iaOCGGOOLfCNgTMj1z61A+DBTQ5oMzK5exvY7nDyiThEqZ7J21YCdY6B6dZPgiJBGYRJIuPV1p1ttE7PLXceZcrgfcMoqdKZFBTEhviKyhoG31vUcHOtE4FLZQiRjb7vkUBCfPP6rgqvNIw6xosiMYwVUQXLrdUIDJRYyOFO7PrdVsQ6ae+9NAqx15YOagHQjOErBQaeZaEwSwcYdWggYmSlgVzT5YglzAygT6tgIkCLuGzVcIyeW3YefuQ4ERQiZ7cIigzX4Te1/mw1dL2ksFzS1Z9Yxfsas01JNesykI08fMTJDeZIHFwD2qyD1stverBUM3NzqWf7kX8qOFk2k7gYRK05qnq6Uhoy7PB9tQPf8rTu+CIi/clKTu+ExFGPDiTl+3osE2fqTjJU7GPCg1JImEAC+DA2OuARIDjDhbE1pJwFJXo
JphrbAnYEQw6TpXEEVTEzDjBowlMhNE8SDHyQbKHMIII5HXZkFbgWcvRHqBwWBnElA6dFp5hwWUTa1WQViqkavqtkpmigH9UvzpJ41CC1ANMTC9VIPQD5ZgasomaEkJigzmU7Ic/BNLdWjn/cu/Gth6BaNnORhTKaeT1VvO8GGWexT0YNIlfo0LxIakyabaBUXLIiEnWzV8TFBiX7GIyGE6RulEtrTMTOZ6q/g0kTyVF1ZdqL3Ha4TvGYWC7AW5pZQQmjDk07u7teB8vrPBNQxdZNb0rVehEECPoweDHZVCiz0+wclFbSvGZvuShaZgsGfJOSvTRdduwrCJ61gxPIpTZFSi22tBDjGTMfw/yb+9OgJmny1PPv6Gaop22ZWYSdO7MQKIEhQOcWw6g7Vb6cFQBFRwycWH0EgPPBaEBMowZwfwhJEVElLOK3wVdnKQt3KR6KZxCTqlPD8Z3CxNVcqIkCStLBaKkWgOX5PEXckU4UJAno6Zw\022\037ref_Coder_FastPrimitivesCoder_7"
}
```
IIUC, with cloudpickle the same instance is shared across the code, which
causes this error. I will double-check this.
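One way to confirm that several stages carry byte-identical payloads is to group stage names by a digest of the payload. This is only a sketch: `group_by_payload` is a hypothetical helper, and the `example` mapping holds placeholder bytes, not the real payloads from the dump above.

```python
import hashlib
from collections import defaultdict


def group_by_payload(payloads):
    """Return lists of stage names whose payload bytes are identical.

    `payloads` maps a stage's unique name to its raw payload bytes.
    """
    groups = defaultdict(list)
    for name, payload in payloads.items():
        digest = hashlib.sha256(payload).hexdigest()
        groups[digest].append(name)
    # Keep only digests that are shared by more than one stage.
    return [sorted(names) for names in groups.values() if len(names) > 1]


# Placeholder data standing in for the real stage payloads.
example = {
    "CombinePerKey(PreCombineFn)/Merge": b"same-bytes",
    "CombinePerKey(PreCombineFn)/ExtractOutputs": b"same-bytes",
    "SomeOtherStage": b"other-bytes",
}
print(group_by_payload(example))
```

With a real pipeline, the mapping could presumably be built from `pipeline.to_runner_api().components.transforms`, using each transform's `unique_name` and `spec.payload`.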
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.