Re: flink ScalarFunction / kafka UDF question (2020-08-03 reply; message text garbled in the archive)

flink ScalarFunction / kafka UDF question (2020-08-02; message text garbled in the archive)

Re: flink ScalarFunction: overriding getParameterTypes does not take effect

2019-11-11 Post by LakeShen
There is no need to override getParameterTypes.
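A plain-Java sketch (no Flink dependency; the class names here are hypothetical) of why this advice holds: with a typed eval signature, the parameter types are visible via reflection, so the planner has something to validate SQL arguments against; an eval(Object...) signature exposes only Object[], so any argument type passes validation.

```java
import java.lang.reflect.Method;
import java.util.Arrays;

public class EvalSignatures {

    // Hypothetical UDF with an untyped varargs eval: nothing to validate against.
    static class UntypedFun {
        public Object eval(Object... params) { return "fun"; }
    }

    // Hypothetical UDF with a typed eval: the parameter type is reflectively visible.
    static class TypedFun {
        public String eval(Long id) { return "fun"; }
    }

    /** Returns the declared parameter types of the class's eval method. */
    static Class<?>[] evalParamTypes(Class<?> udf) {
        return Arrays.stream(udf.getMethods())
                .filter(m -> m.getName().equals("eval"))
                .findFirst()
                .map(Method::getParameterTypes)
                .orElseThrow(() -> new IllegalArgumentException("no eval method"));
    }

    public static void main(String[] args) {
        // Untyped: the only information is Object[] -- any argument type is accepted.
        System.out.println(Arrays.toString(evalParamTypes(UntypedFun.class)));
        // Typed: Long is declared, so a STRING argument can be rejected up front.
        System.out.println(Arrays.toString(evalParamTypes(TypedFun.class)));
    }
}
```

In the thread's example, declaring something like public String eval(String url) would document the contract in the signature itself, letting a type mismatch surface during SQL validation instead of at code generation.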

rockey...@163.com wrote on Mon, Nov 11, 2019 at 5:52 PM:

flink ScalarFunction: overriding getParameterTypes does not take effect

2019-11-11 Post by rockey...@163.com

Hi all,
I am using a custom UDF with the Table API. After overriding getParameterTypes in a ScalarFunction, the syntactic/semantic check does not take effect; instead, the error is only reported after the job starts. (Notably, the same override works correctly for TableFunction, so I suspect a bug on the ScalarFunction side.)
The ScalarFunction is as follows:
public class Fun extends ScalarFunction {

    public Object eval(Object... params) {
        return "fun";
    }

    @Override
    public TypeInformation[] getParameterTypes(Class[] signature) {
        return new RowTypeInfo(Types.LONG).getFieldTypes();
    }
}
The main method:
public static void main(String[] args) throws Exception {
    StreamExecutionEnvironment env =
            StreamExecutionEnvironment.getExecutionEnvironment();
    StreamTableEnvironment tableEnv = StreamTableEnvironment.create(env);
    DataStreamSource<String> stringDataStreamSource = env.fromElements(
            "1001,adc0:x,100",
            "1002,adc1:x,100",
            "1003,adc2:x,100",
            "1004,adc3:x,100",
            "1005,adc4:x,100",
            "1006,adc5:x,100"
    );
    TypeInformation[] types = new TypeInformation[]{Types.LONG,
            Types.STRING, Types.LONG};
    RowTypeInfo typeInformation = new RowTypeInfo(
            types,
            new String[]{"id", "url", "clickTime"});

    // The mapper body was omitted in the original post; it parses each CSV line into a Row.
    DataStream<Row> stream = stringDataStreamSource
            .map(line -> {
                String[] fields = line.split(",");
                return Row.of(Long.valueOf(fields[0]), fields[1], Long.valueOf(fields[2]));
            })
            .returns(typeInformation);

    tableEnv.registerFunction("fun", new Fun());
    tableEnv.registerDataStream("user_click_info", stream,
            String.join(",", typeInformation.getFieldNames()));

    String sql = "select *, fun(url) from user_click_info";
    Table table = tableEnv.sqlQuery(sql);
    DataStream<Row> result = tableEnv.toAppendStream(table, Row.class);
    result.print();
    table.printSchema();
    tableEnv.execute("test");
}
As you can see, url is defined as type STRING, while the parameter in Fun is declared as LONG, so a syntactic/semantic error should be raised at validation time. The actual error is as follows:

Exception in thread "main" org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
	at org.apache.flink.runtime.jobmaster.JobResult.toJobExecutionResult(JobResult.java:146)
	at org.apache.flink.runtime.minicluster.MiniCluster.executeJobBlocking(MiniCluster.java:626)
	at org.apache.flink.streaming.api.environment.LocalStreamEnvironment.execute(LocalStreamEnvironment.java:117)
	at org.apache.flink.table.planner.delegation.StreamExecutor.execute(StreamExecutor.java:46)
	at org.apache.flink.table.api.internal.TableEnvironmentImpl.execute(TableEnvironmentImpl.java:410)
	at com.rock.flink19.tablefunction.TableFunctionTest.main(TableFunctionTest.java:77)
Caused by: java.lang.RuntimeException: Could not instantiate generated class 'StreamExecCalc$11'
	at org.apache.flink.table.runtime.generated.GeneratedClass.newInstance(GeneratedClass.java:67)
	at org.apache.flink.table.runtime.operators.CodeGenOperatorFactory.createStreamOperator(CodeGenOperatorFactory.java:47)
	at org.apache.flink.streaming.runtime.tasks.OperatorChain.createChainedOperator(OperatorChain.java:428)
	at org.apache.flink.streaming.runtime.tasks.OperatorChain.createOutputCollector(OperatorChain.java:354)
	at org.apache.flink.streaming.runtime.tasks.OperatorChain.createChainedOperator(OperatorChain.java:418)
	at org.apache.flink.streaming.runtime.tasks.OperatorChain.createOutputCollector(OperatorChain.java:354)
	at org.apache.flink.streaming.runtime.tasks.OperatorChain.createChainedOperator(OperatorChain.java:418)
	at org.apache.flink.streaming.runtime.tasks.OperatorChain.createOutputCollector(OperatorChain.java:354)
	at org.apache.flink.streaming.runtime.tasks.OperatorChain.<init>(OperatorChain.java:144)
	at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:373)
	at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:705)
	at org.apache.flink.runtime.taskmanager.Task.run(Task.java:530)
	at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.flink.api.common.InvalidProgramException: Table program cannot be compiled. This is a bug. Please file an issue.
	at org.apache.flink.table.runtime.generated.CompileUtils.doCompile(CompileUtils.java:81)
	at org.apache.flink.table.runtime.generated.CompileUtils.compile(CompileUtils.java:65)
	at org.apache.flink.table.runtime.generated.GeneratedClass.compile(GeneratedClass.java:78)
	at org.apache.flink.table.runtime.generated.GeneratedClass.newInstance(GeneratedClass.java:65)
	... 12 more
Caused by: org.codehaus.commons.compiler.CompileException: Line 101, Column 187: Cannot cast "org.apache.flink.table.dataformat.BinaryString" to "java.lang.Long"
	at org.codehaus.janino.UnitCompiler.compileError(UnitCompiler.java:12124)
	at org.codehaus.janino.UnitCompiler.compileGet2(UnitCompiler.java:5049)
	at org.codehaus.janino.UnitCompiler.access$8600(UnitCompiler.java:215)
	at org.codehaus.janino.UnitCompiler$16.visitCast(UnitCompiler.java:4416)
	at org.codehaus.janino.UnitCompiler$16.visitCast(UnitCompiler.java:4394)
	at org.codehaus.janino.Java$Cast.accept(Java.java:4887)
	at org.codehaus.janino.UnitCompiler.compileGet(UnitCompiler.java:4394)
	at org.codehaus.janino.UnitCompiler.compileGet2(UnitCompiler.java:5055)
	at
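The root cause at the bottom of the trace, Cannot cast "BinaryString" to "java.lang.Long", matches the override: because getParameterTypes claimed LONG, the generated operator casts the string-typed url field to Long, and Janino rejects that invalid cast while compiling the generated class. A minimal plain-Java sketch of the same type mismatch (no Flink; the cast is written through Object so javac accepts it and the failure moves to runtime instead of compile time):

```java
public class CastFailure {
    public static void main(String[] args) {
        // The field really holds a string (the url column), but the declared
        // parameter type claims Long, so the code blindly casts it.
        Object urlField = "adc0:x";
        try {
            Long id = (Long) urlField; // invalid cast: a String is not a Long
            System.out.println(id);
        } catch (ClassCastException e) {
            // Surfaces only when the cast executes, not during SQL validation.
            System.out.println("cast failed: " + e.getMessage());
        }
    }
}
```

On the user side, the fix suggested in the reply above is to drop the getParameterTypes override (or declare a typed eval) so the declared and actual field types agree.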