Re: Error consuming Kafka data via the Python API in yarn-session mode
Hi Wei Zhong,

Thank you so much, it finally runs now. I really appreciate your patience in walking a beginner through this. Back then I had copied flink-sql-connector-kafka out of the compiled sources instead of flink-sql-connector-kafka-0.11, so the versions did not match.

Thanks again, and best wishes with your work.

--
From: Wei Zhong
Sent: Tuesday, December 10, 2019 10:23
To: 改改
Cc: user-zh
Subject: Re: Error consuming Kafka data via the Python API in yarn-session mode

Hi 改改,

Judging from the current error, the Kafka versions probably do not match. The kafka connector jar you put into the lib directory needs to be the 0.11 version, i.e. flink-sql-connector-kafka-0.11_2.11-1.9.1.jar.
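The version string declared in the Kafka descriptor is what becomes connector.version in the error below, and the flink-sql-connector-kafka jar in flink/lib must provide a factory for exactly that version. A minimal sketch of a matching declaration, assuming the PyFlink 1.9 descriptor API (topic and hosts taken from this thread; the variable name kafka is illustrative):

    # requires flink-sql-connector-kafka-0.11_2.11-1.9.1.jar in flink/lib
    from pyflink.table.descriptors import Kafka

    kafka = (Kafka()
             .version("0.11")  # must match the version baked into the connector jar
             .topic("user_01")
             .start_from_earliest()
             .property("zookeeper.connect", "hdp03:2181")
             .property("bootstrap.servers", "hdp02:6667"))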
On Dec 10, 2019, at 10:06, 改改 wrote:

Hi Wei Zhong,

Thanks for your reply. The kafka connector jar is already in flink's lib directory; the contents of my flink/lib directory are as follows:

<5600791664319709.png>

My cluster environment is:

java: 1.8.0_231
flink: 1.9.1
Python: 3.6.9
Hadoop: 3.1.1.3.1.4.0-315

Yesterday I tried running it with python3.6 and it still fails, with the following error:

[root@hdp02 data_team_workspace]# /opt/flink-1.9.1/bin/flink run -py tumble_window.py
Starting execution of program
Traceback (most recent call last):
  File "/tmp/pyflink/3fb6ccfd-482f-4426-859a-ebe003e14769/pyflink.zip/pyflink/util/exceptions.py", line 147, in deco
  File "/tmp/pyflink/3fb6ccfd-482f-4426-859a-ebe003e14769/py4j-0.10.8.1-src.zip/py4j/protocol.py", line 328, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling o42.registerTableSource.
: org.apache.flink.table.api.TableException: findAndCreateTableSource failed.
	at org.apache.flink.table.factories.TableFactoryUtil.findAndCreateTableSource(TableFactoryUtil.java:67)
	at org.apache.flink.table.factories.TableFactoryUtil.findAndCreateTableSource(TableFactoryUtil.java:54)
	at org.apache.flink.table.descriptors.ConnectTableDescriptor.registerTableSource(ConnectTableDescriptor.java:69)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.flink.api.python.shaded.py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
	at org.apache.flink.api.python.shaded.py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
	at org.apache.flink.api.python.shaded.py4j.Gateway.invoke(Gateway.java:282)
	at org.apache.flink.api.python.shaded.py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
	at org.apache.flink.api.python.shaded.py4j.commands.CallCommand.execute(CallCommand.java:79)
	at org.apache.flink.api.python.shaded.py4j.GatewayConnection.run(GatewayConnection.java:238)
	at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.flink.table.api.NoMatchingTableFactoryException: Could not find a suitable table factory for 'org.apache.flink.table.factories.TableSourceFactory' in the classpath.

Reason: No context matches.

The following properties are requested:
connector.properties.0.key=zookeeper.connect
connector.properties.0.value=hdp03:2181
connector.properties.1.key=bootstrap.servers
connector.properties.1.value=hdp02:6667
connector.property-version=1
connector.startup-mode=earliest-offset
connector.topic=user_01
connector.type=kafka
connector.version=0.11
format.fail-on-missing-field=true
format.json-schema={ type: 'object', properties: {col1: { type: 'string'},col2: { type: 'string'},col3: { type: 'string'},time: { type: 'string', format: 'date-time'} }}
format.property-version=1
format.type=json
schema.0.name=rowtime
schema.0.rowtime.timestamps.from=time
schema.0.rowtime.timestamps.type=from-field
schema.0.rowtime.watermarks.delay=6
schema.0.rowtime.watermarks.type=periodic-bounded
schema.0.type=TIMESTAMP
schema.1.name=col1
schema.1.type=VARCHAR
schema.2.name=col2
schema.2.type=VARCHAR
schema.3.name=col3
schema.3.type=VARCHAR
update-mode=append

The following factories have been considered:
org.apache.flink.streaming.connectors.kafka.KafkaTableSourceSinkFactory
org.apache.flink.formats.csv.CsvRowFormatFactory
org.apache.flink.addons.hbase.HBaseTableFactory
org.apache.flink.api.java.io.jdbc.JDBCTableSourceSinkFactory
org.apache.flink.formats.json.JsonRowFormatFactory
org.apache.flink.table.catalog.GenericInMemoryCatalogFactory
org.apache.flink.table.sources.CsvBatchTableSourceFactory
org.apache.flink.table.sources.CsvAppendTableSourceFactory
org.apache.flink.table.sinks.CsvBatchTableSinkFactory
org.apache.flink.table.sinks.CsvAppendTableSinkFactory
org.apache.flink.table.planner.StreamPlannerFactory
org.apache.flink.table.executor.StreamExecutorFactory
org.apache.flink.table.planner.delegation.BlinkPlannerFactory
org.apache.flink.table.planner.delegation.BlinkExecutorFactory
	at org.apache.flink.table.factories.TableFactoryService.filterByContext(TableFactoryService.java:283)
	at org.apache.flink.table.factories.TableFactoryService.filter(TableFactoryService.java:191)
	at org.apache.flink.table.factories.TableFactoryService.findSingleInterna
py.py", line 85, in _run_code exec(code, run_globals) File "/tmp/pyflink/3fb6ccfd-482f-4426-859a-ebe003e14769/tumble_window.py", line 62, in .register_table_source("source") File "/tmp/pyflink/3fb6ccfd-482f-4426-859a-ebe003e14769/pyflink.zip/pyflink/table/descriptors.py", line 1293, in register_table_source File "/tmp/pyflink/3fb6ccfd-482f-4426-859a-ebe003e14769/py4j-0.10.8.1-src.zip/py4j/java_gateway.py", line 1286, in __call__ File "/tmp/pyflink/3fb6ccfd-482f-4426-859a-ebe003e14769/pyflink.zip/pyflink/util/exceptions.py", line 154, in deco pyflink.util.exceptions.TableException: 'findAndCreateTableSource failed.' org.apache.flink.client.program.OptimizerPlanEnvironment$ProgramAbortException at org.apache.flink.client.python.PythonDriver.main(PythonDriver.java:83) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:576) at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:438) at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:274) at org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:746) at org.apache.flink.client.cli.CliFrontend.runProgram(CliFrontend.java:273) at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:205) at org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1010) at org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:1083) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730) at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41) at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1083) -- 发件人:Wei Zhong 发送时间:2019年12月10日(星期二) 09:56 收件人:user-zh ; 改改 主 题:Re: yarn-session模式通过python api消费kafka数据报错 Hi 改改, 只看这个报错的话信息量太少不能确定,不过一个可能性比较大的原因是kafka connector的jar包没有放到lib目录下,能否检查一下你的flink的lib目录下是否存在kafka connector的jar包? > 在 2019年12月6日,14:36,改改 写道: > > > [root@hdp02 bin]# ./flink run -yid application_1575352295616_0014 -py > /opt/tumble_window.py > 2019-12-06 14:15:48,262 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli > - Found Yarn properties file under /tmp/.yarn-properties-root. > 2019-12-06 14:15:48,262 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli > - Found Yarn properties file under /tmp/.yarn-properties-root. > 2019-12-06 14:15:48,816 INFO org.apache.hadoop.yarn.client.RMProxy > - Connecting to ResourceManager at > hdp02.wuagecluster/10.2.19.32:8050 > 2019-12-06 14:15:48,964 INFO org.apache.hadoop.yarn.client.AHSProxy > - Connecting to Application History server at > hdp03.wuagecluster/10.2.19.33:10200 > 2019-12-06 14:15:48,973 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli > - No path for the flink jar passed. Using the location of class > org.apache.flink.yarn.YarnClusterDescriptor to locate the jar > 2019-12-06 14:15:48,973 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli > - No path for the flink jar passed. 
> Using the location of class org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
> 2019-12-06 14:15:49,101 INFO  org.apache.flink.yarn.AbstractYarnClusterDescriptor - Found application JobManager host name 'hdp07.wuagecluster' and port '46376' from supplied application id 'application_1575352295616_0014'
> Starting execution of program
> Traceback (most recent call last):
>   File "/usr/lib64/python2.7/runpy.py", line 162, in _run_module_as_main
>     "__main__", fname, loader, pkg_name)
>   File "/usr/lib64/python2.7/runpy.py", line 72, in _run_code
>     exec code in run_globals
>   File "/tmp/pyflink/b9a29ae4-89ac-4289-9111-5f77ad90d386/tumble_window.py", line 62, in <module>
>     .register_table_source("source")
>   File "/tmp/pyflink/b9a29ae4-89ac-4289-9111-5f77ad90d386/pyflink.zip/pyflink/table/descriptors.py", line 1293, in register_table_source
>   File "/tmp/pyflink/b9a29ae4-89ac-4289-9111-5f77ad90d386/py4j-0.10.8.1-src.zip/py4j/java_gateway.py", line 1286, in __call__
>   File "/tmp/pyflink/b9a29ae4-89ac-4289-9111-5f77ad90d386/py
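As a quick way to run the check suggested in the quoted message above, the lib directory can be listed for kafka connector jars; a minimal sketch, assuming flink is installed at /opt/flink-1.9.1 as elsewhere in this thread:

    # list the kafka connector jars visible in the Flink lib directory
    import glob

    jars = glob.glob("/opt/flink-1.9.1/lib/flink-*connector-kafka*.jar")
    print(jars)  # for a 0.11 source, expect flink-sql-connector-kafka-0.11_2.11-1.9.1.jar here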
Error consuming Kafka data via the Python API in yarn-session mode
[root@hdp02 bin]# ./flink run -yid application_1575352295616_0014 -py /opt/tumble_window.py
2019-12-06 14:15:48,262 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli - Found Yarn properties file under /tmp/.yarn-properties-root.
2019-12-06 14:15:48,262 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli - Found Yarn properties file under /tmp/.yarn-properties-root.
2019-12-06 14:15:48,816 INFO  org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at hdp02.wuagecluster/10.2.19.32:8050
2019-12-06 14:15:48,964 INFO  org.apache.hadoop.yarn.client.AHSProxy - Connecting to Application History server at hdp03.wuagecluster/10.2.19.33:10200
2019-12-06 14:15:48,973 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli - No path for the flink jar passed. Using the location of class org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
2019-12-06 14:15:48,973 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli - No path for the flink jar passed. Using the location of class org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
2019-12-06 14:15:49,101 INFO  org.apache.flink.yarn.AbstractYarnClusterDescriptor - Found application JobManager host name 'hdp07.wuagecluster' and port '46376' from supplied application id 'application_1575352295616_0014'
Starting execution of program
Traceback (most recent call last):
  File "/usr/lib64/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib64/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/tmp/pyflink/b9a29ae4-89ac-4289-9111-5f77ad90d386/tumble_window.py", line 62, in <module>
    .register_table_source("source")
  File "/tmp/pyflink/b9a29ae4-89ac-4289-9111-5f77ad90d386/pyflink.zip/pyflink/table/descriptors.py", line 1293, in register_table_source
  File "/tmp/pyflink/b9a29ae4-89ac-4289-9111-5f77ad90d386/py4j-0.10.8.1-src.zip/py4j/java_gateway.py", line 1286, in __call__
  File "/tmp/pyflink/b9a29ae4-89ac-4289-9111-5f77ad90d386/pyflink.zip/pyflink/util/exceptions.py", line 154, in deco
pyflink.util.exceptions.TableException: u'findAndCreateTableSource failed.'
org.apache.flink.client.program.OptimizerPlanEnvironment$ProgramAbortException
	at org.apache.flink.client.python.PythonDriver.main(PythonDriver.java:83)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:576)
	at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:438)
	at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:274)
	at org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:746)
	at org.apache.flink.client.cli.CliFrontend.runProgram(CliFrontend.java:273)
	at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:205)
	at org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1010)
	at org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:1083)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
	at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
	at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1083)
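For context, the .register_table_source("source") at tumble_window.py line 62 is the tail of a connect() descriptor chain. The script itself is not shown in the thread, but a minimal sketch consistent with the requested properties in the error above (topic, hosts, field names, and the watermark delay are taken from that listing; everything else is an assumption based on the PyFlink 1.9 descriptor API):

    from pyflink.datastream import StreamExecutionEnvironment
    from pyflink.table import StreamTableEnvironment, DataTypes
    from pyflink.table.descriptors import Kafka, Json, Schema, Rowtime

    env = StreamExecutionEnvironment.get_execution_environment()
    st_env = StreamTableEnvironment.create(env)

    (st_env.connect(
         Kafka()
         .version("0.11")                  # connector.version=0.11
         .topic("user_01")
         .start_from_earliest()
         .property("zookeeper.connect", "hdp03:2181")
         .property("bootstrap.servers", "hdp02:6667"))
     .with_format(
         Json()
         .fail_on_missing_field(True)
         .json_schema("{ type: 'object', properties: {"
                      "col1: { type: 'string'},"
                      "col2: { type: 'string'},"
                      "col3: { type: 'string'},"
                      "time: { type: 'string', format: 'date-time'} }}"))
     .with_schema(
         Schema()
         .field("rowtime", DataTypes.TIMESTAMP())
         .rowtime(Rowtime()
                  .timestamps_from_field("time")
                  .watermarks_periodic_bounded(6))  # schema.0.rowtime.watermarks.delay=6
         .field("col1", DataTypes.STRING())
         .field("col2", DataTypes.STRING())
         .field("col3", DataTypes.STRING()))
     .in_append_mode()
     .register_table_source("source"))     # the call at line 62 in the traceback

Registration only fails at this point when no table factory on the classpath matches these properties, which is why the fix in this thread is the matching flink-sql-connector-kafka-0.11 jar rather than a change to the script.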