GitHub user mengw15 created a discussion: slowness issue when sending data into Python-based operators
#### Summary

Seeing severe latency when data flows from an upstream operator into a Python-based operator. The compute inside Python is fast; the delays happen before and after it, suggesting a transmission/bridge issue.

#### Example workflow

Workflow A: CSV Scan → Sort (Python)

- ~3 minutes from workflow start until Sort begins processing
- Sort compute itself < 1s
- Then ~2 minutes before the workflow reports completion

Workflow B: CSV Scan → Aggregation (Java native)

- End-to-end ≈ 20s

#### Hypothesis

Bottleneck in JVM ↔ Python handoff

GitHub link: https://github.com/apache/texera/discussions/4032
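To make the size of the gap concrete, here is a rough back-of-the-envelope sketch using the approximate timings reported above. The constants are rounded from the report ("~3 minutes" taken as 180 s, "< 1s" as 1 s, "~2 minutes" as 120 s, "≈ 20s" as 20 s), and the names `WORKFLOW_A`, `WORKFLOW_B_TOTAL`, and `summarize` are hypothetical, introduced only for illustration. Note that the two workflows run different operators (Sort vs. Aggregation), so the comparison is indicative rather than exact.

```python
# Approximate timings taken from the report, in seconds.
# "< 1s" is rounded up to 1.0, so the compute share is an upper bound.
WORKFLOW_A = {"before_sort": 180.0, "sort_compute": 1.0, "after_sort": 120.0}
WORKFLOW_B_TOTAL = 20.0


def summarize(phases, compute_key):
    """Return total time, time spent outside compute, and that share."""
    total = sum(phases.values())
    overhead = total - phases[compute_key]
    return {
        "total": total,
        "overhead": overhead,
        "overhead_share": overhead / total,
    }


if __name__ == "__main__":
    a = summarize(WORKFLOW_A, "sort_compute")
    print(f"Workflow A total: {a['total']:.0f}s")
    print(f"Outside compute: {a['overhead']:.0f}s ({a['overhead_share']:.1%})")
    print(f"Slowdown vs Workflow B: {a['total'] / WORKFLOW_B_TOTAL:.1f}x")
```

Under these assumed roundings, nearly all of Workflow A's wall-clock time falls outside the Sort compute itself, which is consistent with the handoff hypothesis but does not by itself confirm where in the pipeline the time goes.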
