Hi, Apollo.
The aggregation is executed by classes produced by code generation (codegen).
You can inspect the generated aggregate class by setting the following
parameters, which make Janino write the generated source to the given directory:
env.java.opts.taskmanager: "-Dorg.codehaus.janino.source_debugging.enable=true
-Dorg.codehaus.janino.source_debugging.dir=/flink/log/"
--
Best!
Xuyang
At 2024-09-21 14:21:14, "Apollo Elon" <[email protected]> wrote:
I implemented a custom UDAF to calculate the median of the age field in a table
with 120 million rows, and I see a significant performance difference between
two accumulator types: one holding an ArrayList and one holding a ListView. The
ListView documentation says that it is backed by the state backend when the
amount of data is large. How can I observe this process, and which classes are
responsible for it?
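For reference, the median step itself can be sketched independently of the accumulator type. This is only a minimal plain-JDK sketch, not the actual UDAF: `MedianSketch` and `median` are hypothetical names, and it assumes the accumulated values are available as an `Iterable<Double>` (an `ArrayList` already is one, and `ListView.get()` returns an `Iterable`) and that `scale` means the number of decimal places to round to.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class MedianSketch {

    // Hypothetical helper: median of the values, rounded to `scale` decimals.
    // Returns null when there are no non-null values.
    public static Double median(Iterable<Double> numbers, int scale) {
        // Copy into a heap list so the values can be sorted.
        List<Double> sorted = new ArrayList<>();
        for (Double n : numbers) {
            if (n != null) {
                sorted.add(n);
            }
        }
        if (sorted.isEmpty()) {
            return null;
        }
        Collections.sort(sorted);

        int mid = sorted.size() / 2;
        double raw = (sorted.size() % 2 == 1)
                ? sorted.get(mid)                               // odd count: middle element
                : (sorted.get(mid - 1) + sorted.get(mid)) / 2.0; // even count: mean of the two middle elements

        // Assumption: `scale` is the number of decimal places to keep.
        return BigDecimal.valueOf(raw)
                .setScale(scale, RoundingMode.HALF_UP)
                .doubleValue();
    }

    public static void main(String[] args) {
        System.out.println(median(Arrays.asList(31.0, 25.0, 40.0, 28.0), 2));
    }
}
```

Note that this sketch copies all values into a heap list before sorting, whichever accumulator supplied them.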
```java
// Variant 1: the accumulator keeps all values in a plain ArrayList.
public static class State implements Serializable {
    public int scale = 2;

    @DataTypeHint(value = "ARRAY<DOUBLE>")
    public ArrayList<Double> numbers;

    public State() {}
}

@Override
public State createAccumulator() {
    State state = new State();
    state.numbers = new ArrayList<>();
    return state;
}
```

```java
// Variant 2: the accumulator keeps the values in a ListView.
public static class State implements Serializable {
    public int scale = 2;

    public ListView<Double> numbers;

    public State() {}
}

@Override
public State createAccumulator() {
    State state = new State();
    state.numbers = new ListView<>();
    return state;
}
```