Re: Install/Run Streaming Anomaly Detection R package in Flink

2021-02-23 Thread Wei Zhong
chatryan wrote: > Hi, > I'm pulling in Wei Zhong and Xingbo Huang, who know PyFlink better. > Regards, > Roman > On Mon, Feb 22, 2021 at 3:01 PM Robert Cullen <cinquate...@gmail.com> wrote: > My customer wants us to install thi…

Re: pyflink的py4j里是不是可以调用我自己写的java程序 ?

2021-02-05 Thread Wei Zhong
Try calling: get_gateway().jvm.Test2.Test2.main(None) > On Feb 5, 2021 at 18:27, 瞿叶奇 <389243...@qq.com> wrote: > Hi, with the list argument the error is gone, but the jar is still not loaded. >>> from pyflink.util.utils import add_jars_to_context_class_loader >>> add_jars_to_context_class_loader(['file:///root/Test2.jar']) >>> from …

Re: pyflink的py4j里是不是可以调用我自己写的java程序 ?

2021-02-05 Thread Wei Zhong
The image does not seem to load, but I guess it is an argument type problem? This function expects a List: add_jars_to_context_class_loader(["file:///xxx"]) > On Feb 5, 2021 at 17:48, 瞿叶奇 <389243...@qq.com> wrote: > Hi, I upgraded to flink 1.12.0, and this function now reports an error when loading the class. Is my URL wrong? Also, does pyflink have a way to connect to HDFS with Kerberos authentication? > -- Original message -- > From: …

Re: pyflink的py4j里是不是可以调用我自己写的java程序 ?

2021-02-05 Thread Wei Zhong
Hi, first you need to package your Java program into a jar, then hot-load that jar. PyFlink currently ships a util method you can call directly: from pyflink.util.utils import add_jars_to_context_class_loader; add_jars_to_context_class_loader("file:///xxx") # note: the path must be in URL format. After that you can call your class through the Java gateway: from pyflink.java_gateway import get_gateway …
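Putting the two replies in this thread together, a minimal sketch of the workflow. The jar path and the `Test2` class name come from the thread; the URL-building helper is an assumption added here, and the PyFlink calls are shown as comments since they need a running PyFlink installation:

```python
from pathlib import Path

def to_jar_url(path: str) -> str:
    # add_jars_to_context_class_loader expects URL-style paths such as
    # "file:///root/Test2.jar", passed as a *list* (per the thread, a list
    # argument avoided the earlier error a bare string caused).
    return Path(path).absolute().as_uri()

jar_urls = [to_jar_url("/root/Test2.jar")]
print(jar_urls)

# With PyFlink available, the hot-load and gateway call from the replies:
#   from pyflink.util.utils import add_jars_to_context_class_loader
#   from pyflink.java_gateway import get_gateway
#   add_jars_to_context_class_loader(jar_urls)
#   get_gateway().jvm.Test2.Test2.main(None)
```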

Re: pyflink连接kerberos的kafka问题

2021-02-02 Thread Wei Zhong
Hi, the first problem is likely that no matching KDC realm can be found with your current configuration; you can try setting it manually via System.setProperty, e.g. System.setProperty("java.security.krb5.realm", "..."); System.setProperty("java.security.krb5.kdc", "..."). As for the second problem, 'update-mode'='append' means the sink only accepts append messages from the upstream operator; it does not mean files are written in append mode. I suspect the property you actually want is 'format.write-mode'='OVERWRITE'.

Re: 请问pyflink如何跟kerberos认证的kafka对接呢

2021-01-31 Thread Wei Zhong
Hi, from your earlier mails, you have put the Kerberos settings into some flink-conf.yaml and then started a local-mode shell, right? The pyflink shell in local mode does not read any flink-conf.yaml on its own. You need to set the FLINK_HOME environment variable and write the settings into $FLINK_HOME/conf/flink-conf.yaml, and flink-conf.yaml is only read when a job is submitted (flink run, remote mode, or yarn mode). If you still want to try it in local mode, you can use the following code: from …
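For the submit-time path described above, the Kerberos settings would live in $FLINK_HOME/conf/flink-conf.yaml. A sketch with placeholder values (the keytab path and principal are assumptions, not from the thread):

```yaml
# $FLINK_HOME/conf/flink-conf.yaml -- read by `flink run`, remote and yarn modes
security.kerberos.login.use-ticket-cache: false
security.kerberos.login.keytab: /path/to/user.keytab
security.kerberos.login.principal: user@EXAMPLE.COM
# expose the credentials to the Kafka connector's JAAS context as well
security.kerberos.login.contexts: Client,KafkaClient
```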

Re: [ANNOUNCE] Apache Flink 1.12.1 released

2021-01-19 Thread Wei Zhong
Thanks Xintong for the great work! Best, Wei > On Jan 19, 2021 at 18:00, Guowei Ma wrote: > Thanks Xintong's effort! > Best, > Guowei > On Tue, Jan 19, 2021 at 5:37 PM Yangze Guo wrote: > Thanks Xintong for the great work! > Best, > Yangze Guo > On Tue, Jan …

Re: PyFlink on Yarn, Per-Job模式,如何增加多个外部依赖jar包?

2021-01-05 Thread Wei Zhong
Hi Zhizhao, could you check whether 'file://' is followed by an absolute path? The error is raised because the corresponding path cannot be found on the local disk. > On Jan 6, 2021 at 10:23, Zhizhao Shangguan wrote: > Hi: > PyFlink on Yarn, per-job mode: how do I add multiple external dependency jars, such as flink-sql-connector-kafka and flink-connector-jdbc? > Environment: > Flink version: 1.11.0 > OS: mac > I tried the following and ran into problems: > 1. …
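One way to pass several dependency jars in per-job mode is the `pipeline.jars` option, which takes absolute `file://` URLs joined by semicolons. A sketch (the jar names are assumptions; the commented `t_env` call is the PyFlink API for setting it in 1.11+):

```python
from pathlib import Path

# Hypothetical per-job dependencies (names are placeholders, not from the thread)
jars = [
    "/opt/jars/flink-sql-connector-kafka_2.11-1.11.0.jar",
    "/opt/jars/flink-connector-jdbc_2.11-1.11.0.jar",
]

# pipeline.jars expects absolute file:// URLs separated by semicolons
pipeline_jars = ";".join(Path(j).absolute().as_uri() for j in jars)
print(pipeline_jars)

# With a TableEnvironment t_env, you would then set:
#   t_env.get_config().get_configuration().set_string("pipeline.jars", pipeline_jars)
```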

Re: pyflink 1.12 是不支持 通过sql 直接向数据库获取数据的操作么? 没看到相关接口

2020-12-22 Thread Wei Zhong
Hello, pyflink fetches data from a database by declaring a jdbc connector. > On Dec 22, 2020 at 17:40, 肖越 <18242988...@163.com> wrote: > For example, the equivalent of pandas.read_sql(), which returns the source data directly. PyFlink newbie here, hoping for an expert's answer.
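A sketch of such a declaration for the Flink 1.12 jdbc connector (the database URL, credentials, and column names are assumptions; with the jdbc connector jar on the classpath, `t_env.sql_query("SELECT ... FROM db_source")` then reads from the table):

```sql
CREATE TABLE db_source (
  id DECIMAL,
  pf_id VARCHAR
) WITH (
  'connector' = 'jdbc',
  'url' = 'jdbc:mysql://localhost:3306/mydb',
  'table-name' = 'ts_pf_sec_yldrate',
  'username' = 'flink',
  'password' = 'secret'
)
```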

Re: pyflink1.12 进行多表关联后的结果类型是TableResult,如何转为Table类型

2020-12-22 Thread Wei Zhong
Hello, use t_env.sql_query() to execute the select statement and obtain a Table-typed result (execute_sql() returns a TableResult). > On Dec 22, 2020 at 13:25, 肖越 <18242988...@163.com> wrote: > Left join query via sql, the statement being: > sql = ''' Insert into print_sink select a.id, a.pf_id, b.symbol_id from a \ > left join b on b.day_id = a.biz_date where a.ccy_type = 'AC' and \ > …

Re: pyflink1.12 连接Mysql报错 : Missing required options

2020-12-21 Thread Wei Zhong
Hi, as the error message says, the with clause requires a "url" option. Try renaming connector.url to url and see whether the error goes away. > On Dec 21, 2020 at 13:44, 肖越 <18242988...@163.com> wrote: > I defined two source DDLs in the script, but the second one keeps complaining about a missing option. PyFlink newbie, asking for help. > #DDL definition > source_ddl2 = """CREATE TABLE ts_pf_sec_yldrate (id DECIMAL, pf_id VARCHAR, \ > symbol_id …

Re: Urgent help on S3 CSV file reader DataStream Job

2020-12-16 Thread Wei Zhong
Deep > On Mon, 14 Dec 2020 at 10:28, DEEP NARAYAN Singh <about.d...@gmail.com> wrote: > Hi Wei, > No problem at all. Thanks for your response. > Yes, it is just starting from the beginning, as if no checkpointing had finished. > Thanks, > -Deep

Re: Pandas UDF处理过的数据sink问题

2020-12-13 Thread Wei Zhong
Hi Lucas, the problem is this: the Pandas function's output type is a single Row column, while your sink expects one BIGINT column and one INT column. You can try rewriting the sql statement as: select orderCalc(code, amount).get(0), orderCalc(code, amount).get(1) from `some_source` group by TUMBLE(eventTime, INTERVAL '1' SECOND), code, amount. Also, this really looks like a Pandas UDAF use case; if so, replace "@udf" with "@udaf". Best, …
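The aggregation body itself can be prototyped in plain pandas before wiring it into a `@udaf`. A sketch (the name orderCalc comes from the thread, but its logic here is an assumption; only the two-scalars-instead-of-one-Row shape is the point):

```python
import pandas as pd

def order_calc(code: pd.Series, amount: pd.Series):
    # Per window, return two scalars (a BIGINT and an INT) rather than
    # one Row, so a sink with two separate columns can accept them.
    return int(amount.sum()), int(code.iloc[0])

result = order_calc(pd.Series([7, 7, 7]), pd.Series([10, 20, 30]))
print(result)
```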

Re: [ANNOUNCE] Apache Flink 1.12.0 released

2020-12-10 Thread Wei Zhong
Congratulations! Thanks Dian and Robert for the great work! Best, Wei > On Dec 10, 2020 at 20:26, Leonard Xu wrote: > Thanks Dian and Robert for the great work as release managers! > And thanks everyone who made the release possible! > Best, > Leonard > On Dec 10, 2020 at 20:17, Robert …

Re: Urgent help on S3 CSV file reader DataStream Job

2020-12-08 Thread Wei Zhong
…t()) { return true; } else { throw new ParseException("Row too short: " + new String(bytes, offset, numBytes, getCharset())); } } > Let me know if you need any details. > Thanks, > -Deep > On Tue, …

Re: Urgent help on S3 CSV file reader DataStream Job

2020-12-07 Thread Wei Zhong
Regards, > -Deep > On Mon, Dec 7, 2020 at 6:38 PM Till Rohrmann wrote: > Hi Deep, > Could you use the TextInputFormat, which reads a file line by line? That way you can do the JSON parsing as part of a mapper which consumes the file lines. > Cheers, …

Re: Urgent help on S3 CSV file reader DataStream Job

2020-12-07 Thread Wei Zhong
Hi Deep, (redirecting this to the user mailing list as this is not a dev question) You can try to set the line delimiter and field delimiter of the RowCsvInputFormat to a non-printing character (assuming there are no non-printing characters in the csv files). It will read all the content of a csv …
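The idea behind the non-printing delimiter can be illustrated in plain Python: splitting on a delimiter that never occurs in the data yields the entire file as a single record. This is only a sketch of the principle; RowCsvInputFormat itself is configured through its Flink/Java API:

```python
csv_content = "a,b,c\n1,2,3\n4,5,6\n"

# "\x01" is a non-printing character assumed to be absent from the file;
# using it as the record delimiter keeps the whole content in one piece.
records = csv_content.split("\x01")
print(len(records))  # the file arrives as a single record
```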

Re: SQL解析复杂JSON问题

2020-12-04 Thread Wei Zhong
Yes, in 1.11 custom JSON parsing and mapping can only be done outside the json format. > On Dec 4, 2020 at 17:19, 李轲 wrote: > So if we want custom parsing and mapping in 1.11, is a UDF the only option? > Sent from my iPhone > On Dec 4, 2020 at 16:52, Wei Zhong wrote: >> Hi, >> It depends on the flink version you use: 1.11 derives it automatically from the table schema, while in 1.10, if the table …

Re: SQL解析复杂JSON问题

2020-12-04 Thread Wei Zhong
Hi, it depends on the flink version you use: 1.11 parses it automatically from the table schema, while in 1.10, if the table schema and the json schema are not exactly the same, you have to write the json-schema by hand: https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/table/connectors/formats/json.html#data-type-mapping

Re: 用代码执行flink sql 报错 Multiple factories for identifier 'jdbc' that implement

2020-12-03 Thread Wei Zhong
The error-reporting problem has been fixed in a later release; once the new version is out, upgrading will let you see directly from the error message which TableFactory implementations conflict: https://issues.apache.org/jira/browse/FLINK-20186 > On Dec 3, 2020 at 20:08, Wei Zhong wrote: > Hi, > The current TableFactory lookup code seems to have a problem in its error reporting: the real class names are not shown. You can first run the following code manually …

Re: 用代码执行flink sql 报错 Multiple factories for identifier 'jdbc' that implement

2020-12-03 Thread Wei Zhong
Hi, the TableFactory lookup code currently has an error-reporting problem and does not show the real class names. You can run the following code manually to see which classes are being treated as JDBC DynamicTableSinkFactory: List<Factory> result = new LinkedList<>(); ServiceLoader.load(Factory.class, Thread.currentThread().getContextClassLoader()).iterator().forEachRemaining(result::add); List jdbcResult = …

Re: Flink CEP 动态加载 pattern

2020-12-02 Thread Wei Zhong
Hi, Flink CEP does not support dynamically loading rules yet. The community has a JIRA tracking this request: https://issues.apache.org/jira/browse/FLINK-7129 You can follow that JIRA for the latest progress. > On Dec 2, 2020 at 17:48, huang botao wrote: > Hi, we often face rule changes in our project. How do we usually load these rules dynamically? Does Flink CEP have a native API that supports dynamically loading rules?

Re: PyFlink - Scala UDF - How to convert Scala Map in Table API?

2020-11-18 Thread Wei Zhong
Type conversions working: > - scala.collection.immutable.Map[String,String] => org.apache.flink.types.Row => ROW > - scala.collection.immutable.Map[String,String] => java.util.Map[String,String] => MAP > Any hint for Map[String,Any]? > Best regards, …

Re: pyflink利用sql ddl连接hbase-1.4.x出错Configuring the input format (null) failed: Cannot create connection to HBase

2020-11-18 Thread Wei Zhong
Hi, the root cause is that the class io.netty.channel.EventLoopGroup cannot be found. Could you check whether the classpath contains a netty jar, or whether netty is shaded inside the related jars? > On Nov 16, 2020 at 17:02, ghostviper wrote: > Environment: > hbase-1.4.13 > flink-1.11.1 > python-3.6.1 > pyflink-1.0 > Configuration done so far: > 1. the hbase path has been added to the hadoop classpath …

Re: pyflink 1.11 运行pyflink作业时报错

2020-11-18 Thread Wei Zhong
Hi, the current error message alone is not enough to diagnose the problem. Could you post the source code of the failing part of the job? > On Nov 17, 2020 at 16:58, whh_960101 wrote: > Hi all, with pyflink 1.11, after submitting the job to the yarn cluster it fails while inserting the processed main_table into the kafka sink: File "/home/cdh272705/poc/T24_parse.py", line 179, in from_kafka_to_oracle_demo …

Re: PyFlink - Scala UDF - How to convert Scala Map in Table API?

2020-11-17 Thread Wei Zhong
…aType [error] (x$1: org.apache.flink.table.api.DataTypes.Field*)org.apache.flink.table.types.DataType [error] cannot be applied to (org.apache.flink.table.types.DataType, org.apache.flink.table.types.DataType) [error] override def getResultType(signature: Array[Class[_]]): TypeInformation[_] = Da…

Re: PyFlink - Scala UDF - How to convert Scala Map in Table API?

2020-11-17 Thread Wei Zhong
…larFunction { > def eval(): Row = { > Row.of(java.lang.String.valueOf("foo"), java.lang.String.valueOf("bar")) > } > } > Best regards, > On Tue, Nov 17, 2020 at 10:04, Wei Zhong <weizhong0...@gmail.com> wrote: > Hi Pierre, …

Re: PyFlink - Scala UDF - How to convert Scala Map in Table API?

2020-11-17 Thread Wei Zhong
Hi Pierre, You can try to replace the '@DataTypeHint("ROW")' with '@FunctionHint(output = new DataTypeHint("ROW"))' Best, Wei > On Nov 17, 2020 at 15:45, Pierre Oberholzer wrote: > Hi Dian, Community, > (bringing the thread back to a wider audience) > As you suggested, I've tried to use …

Re: [ANNOUNCE] New PMC member: Dian Fu

2020-08-28 Thread Wei Zhong
Congratulations Dian! > On Aug 28, 2020 at 14:29, Jingsong Li wrote: > Congratulations, Dian! > Best, Jingsong > On Fri, Aug 28, 2020 at 11:06 AM Walter Peng wrote: > congrats! > Yun Tang wrote: > Congratulations, Dian! > -- > Best, Jingsong Lee

Re: [DISCUSS] FLIP-133: Rework PyFlink Documentation

2020-08-05 Thread Wei Zhong
…improve the Python API doc. I have received many feedbacks from PyFlink beginners about the PyFlink doc, e.g. the materials are too few, the Python doc is mixed …

Re: PyFlink DDL UDTF join error

2020-07-29 Thread Wei Zhong
On Jul 28, 2020 at 21:19, Wei Zhong wrote: > Hi Manas, > It seems like a bug. You can try to replace the udtf sql call with such code as a workaround currently: > t_env.register_table("tmp_view", t_env.from_path(f"{INPUT_TABLE}").join_lateral("split(…

Re: PyFlink DDL UDTF join error

2020-07-28 Thread Wei Zhong
Hi Manas, It seems like a bug. You can try to replace the udtf sql call with the following code as a workaround for now: t_env.register_table("tmp_view", t_env.from_path(f"{INPUT_TABLE}").join_lateral("split(data) as (featureName, featureValue)")) This works for me. I'll try to find out what caused …

Re: [ANNOUNCE] Apache Flink 1.11.1 released

2020-07-22 Thread Wei Zhong
Congratulations! Thanks Dian for the great work! Best, Wei > On Jul 22, 2020 at 15:09, Leonard Xu wrote: > Congratulations! > Thanks Dian Fu for the great work as release manager, and thanks everyone involved! > Best > Leonard Xu > On Jul 22, 2020 at 14:52, Dian Fu wrote: >> The Apache Flink …

Re: windows用户使用pyflink问题

2020-04-27 Thread Wei Zhong
Hi Tao, Windows support for PyFlink is under development and is expected to ship in 1.11. That will solve developing PyFlink on Windows. > On Apr 28, 2020 at 10:23, tao siyuan wrote: > OK, I'll give it a try > Zhefu PENG wrote on Tue, Apr 28, 2020 at 10:16: >> You can try adding the contents of site-packages under external libs, which helps development efficiency. >> On Tue, Apr 28, 2020 at 10:13, tao siyuan wrote: >>> …

Re: [VOTE] Release Flink Python API(PyFlink) 1.9.2 to PyPI, release candidate #1

2020-02-10 Thread Wei Zhong
Hi, Thanks for driving this, Jincheng. +1 (non-binding) - Verified signatures and checksums. - Verified README.md and setup.py. - Run `pip install apache-flink-1.9.2.tar.gz` in Python 2.7.15 and Python 3.7.5 successfully. - Start local pyflink shell in Python 2.7.15 and Python 3.7.5 via

Re: [DISCUSS] Upload the Flink Python API 1.9.x to PyPI for user convenience.

2020-02-03 Thread Wei Zhong
Hi Jincheng, Thanks for bringing up this discussion! +1 for this proposal. Building from source takes a long time and requires a good network environment. Some users may not have such an environment. Uploading to PyPI will greatly improve the user experience. Best, Wei > jincheng sun wrote on Tue, Feb 4, 2020 …

Re: [ANNOUNCE] Dian Fu becomes a Flink committer

2020-01-16 Thread Wei Zhong
Congrats Dian Fu! Well deserved! Best, Wei > On Jan 16, 2020 at 18:10, Hequn Cheng wrote: > Congratulations, Dian. > Well deserved! > Best, Hequn > On Thu, Jan 16, 2020 at 6:08 PM Leonard Xu wrote: > Congratulations! Dian Fu > Best, > Leonard > On …

Re: [ANNOUNCE] Apache Flink 1.8.3 released

2019-12-11 Thread Wei Zhong
Thanks Hequn for being the release manager. Great work! Best, Wei > On Dec 12, 2019 at 15:27, Jingsong Li wrote: > Thanks Hequn for your driving, 1.8.3 fixed a lot of issues and is very useful to users. > Great work! > Best, > Jingsong Lee > On Thu, Dec 12, 2019 at 3:25 PM jincheng sun …

Re: yarn-session模式通过python api消费kafka数据报错

2019-12-09 Thread Wei Zhong
Hi 改改, judging from the current error, it is probably a kafka version mismatch: the kafka connector jar in the lib directory needs to be the 0.11 version, i.e. flink-sql-connector-kafka-0.11_2.11-1.9.1.jar > On Dec 10, 2019 at 10:06, 改改 wrote: > HI Wei Zhong, > thanks for your reply. The kafka connector jar is already under flink's lib directory; the files in my flink/lib directory are as follows: > …

Re: yarn-session模式通过python api消费kafka数据报错

2019-12-09 Thread Wei Zhong
Hi 改改, the error alone gives too little information to be sure, but one fairly likely cause is that the kafka connector jar was not put into the lib directory. Could you check whether a kafka connector jar exists under your flink lib directory? > On Dec 6, 2019 at 14:36, 改改 wrote: > [root@hdp02 bin]# ./flink run -yid application_1575352295616_0014 -py /opt/tumble_window.py > 2019-12-06 14:15:48,262 INFO …