hive-exec dependency causing a hadoop conflict

2020-08-24 Thread amen...@163.com
hi, everyone. Component versions: flink-1.11.1, hive-2.1.1. Problem description: I wrote a kafka-to-mysql streaming demo using the Table API's executeSql() method; without the hive-exec dependency, the packaged job runs fine when submitted to the yarn cluster. When testing HiveCatalog and reading/writing Hive tables, it runs without issue on a Standalone Cluster and flink reads/writes the hive table normally (no hadoop dependency conflict); but when submitted to yarn a hadoop conflict occurs. Inspecting the program dependencies in IDEA shows that pulling in the hive-exec dependency automatically brings in hadoop and hdfs
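A minimal pom.xml sketch of one common workaround, assuming the conflict really does come from hive-exec's transitive hadoop artifacts (the wildcard exclusion is illustrative, not a verified fix for this thread):

    <dependency>
      <groupId>org.apache.hive</groupId>
      <artifactId>hive-exec</artifactId>
      <version>2.1.1</version>
      <exclusions>
        <!-- assumption: drop the transitive hadoop/hdfs jars and rely on the ones yarn provides -->
        <exclusion>
          <groupId>org.apache.hadoop</groupId>
          <artifactId>*</artifactId>
        </exclusion>
      </exclusions>
    </dependency>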

Re: Flink S3 Hadoop dependencies

2020-08-14 Thread Chesnay Schepler
Saley wrote: Hi team, Was there a reason for not shading hadoop-common https://github.com/apache/flink/commit/e1e7d7f7ecc080c850a264021bf1b20e3d27d373#diff-e7b798a682ee84ab804988165e99761cR38-R44 ? This is leaking lots of classes such as guava and causing issues in our flink application. I see

Flink S3 Hadoop dependencies

2020-08-14 Thread Satish Saley
Hi team, Was there a reason for not shading hadoop-common https://github.com/apache/flink/commit/e1e7d7f7ecc080c850a264021bf1b20e3d27d373#diff-e7b798a682ee84ab804988165e99761cR38-R44 ? This is leaking lots of classes such as guava and causing issues in our flink application. I see that hadoop

Re:Re: Re: How to configure hadoop for Flink 1.11.1 on k8s

2020-08-10 Thread RS
Hi, yes, I tried again and this way works; I had made a mistake before. Thanks~ Thx. On 2020-08-10 13:36:36, "Yang Wang" wrote: >Did you build a new image yourself and put flink-shaded-hadoop-2-uber-2.8.3-10.0.jar under lib? >If so, this problem shouldn't occur. > >Best, >Yang > >RS wrote on Mon, Aug 10, 2020 at 12:04 PM: > >> Hi, >> I downloaded flink-shaded-hadoop-2-uber-

Re: Re: How to configure hadoop for Flink 1.11.1 on k8s

2020-08-09 Thread Yang Wang
Did you build a new image yourself and put flink-shaded-hadoop-2-uber-2.8.3-10.0.jar under lib? If so, this problem shouldn't occur. Best, Yang. RS wrote on Mon, Aug 10, 2020 at 12:04 PM: > Hi, > I downloaded flink-shaded-hadoop-2-uber-2.8.3-10.0.jar, put it under lib, and restarted the cluster, > but starting a job still fails with: > Caused by: org.apache.flink.core.fs.UnsupportedFileSystemSchemeExcept

Re:Re: How to configure hadoop for Flink 1.11.1 on k8s

2020-08-09 Thread RS
Hi, I downloaded flink-shaded-hadoop-2-uber-2.8.3-10.0.jar, put it under lib, and restarted the cluster, but starting a job still fails with: Caused by: org.apache.flink.core.fs.UnsupportedFileSystemSchemeException: Could not find a file system implementation for scheme 'hdfs'. The scheme is not directly supported by Flink and no Hadoop file system

Re: How to configure hadoop for Flink 1.11.1 on k8s

2020-08-09 Thread Yang Wang
Matt Wang is right. The Flink binaries and images currently released do not include flink-shaded-hadoop, so you need to add a layer on top of the official image to put flink-shaded-hadoop [1] into /opt/flink/lib:

FROM flink
COPY /path/of/flink-shaded-hadoop-2-uber-*.jar $FLINK_HOME/lib/

[1]. https://mvnrepository.com/artifact/org.apache.flink/flink-shaded-hadoop-2-uber Best, Yang
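For reference, a minimal build-and-use sketch for the layer described above (image tag is a placeholder):

    docker build -t flink-with-shaded-hadoop:1.11.1 .
    # then point the k8s session/job deployment at flink-with-shaded-hadoop:1.11.1 instead of the official image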

Re: How to configure hadoop for Flink 1.11.1 on k8s

2020-08-07 Thread Matt Wang
The official image only contains Flink itself; if you need to access HDFS, you have to bake the Hadoop jars and configuration into the image. -- Best, Matt Wang. On Aug 7, 2020 12:49, caozhen wrote: By the way, here is the hadoop integration wiki for flink 1.11.1: https://ci.apache.org/projects/flink/flink-docs-release-1.11/ops/deployment/hadoop.html The docs say flink-shaded-hadoop-2-uber is no longer provided and give two solutions: 1. the recommended way is to use

Re: How to configure hadoop for Flink 1.11.1 on k8s

2020-08-06 Thread caozhen
By the way, here is the hadoop integration wiki for flink 1.11.1: https://ci.apache.org/projects/flink/flink-docs-release-1.11/ops/deployment/hadoop.html The docs say flink-shaded-hadoop-2-uber is no longer provided and give two solutions: 1. the recommended way is to load the hadoop dependencies via HADOOP_CLASSPATH; 2. alternatively, put the hadoop dependencies into the flink client's lib directory. *When running 1.11.1 flink on yarn I took the second approach: download the hadoop src package and copy some commonly used dependencies into the lib directory
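A minimal sketch of the first (recommended) option, assuming a hadoop client is installed on the machine that runs the flink client:

    export HADOOP_CLASSPATH=$(hadoop classpath)
    ./bin/yarn-session.sh -tm 1024 -s 4    # flink now picks up the hadoop classes from the environment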

How to configure hadoop for Flink 1.11.1 on k8s

2020-08-06 Thread RS
Hi, I want to run Flink 1.11.1 on K8S using the flink:1.11.1-scala_2.12 image. Following the official docs I deployed a session cluster; the jobmanager and taskmanager both start successfully, but submitting a job fails with: Caused by: org.apache.flink.core.fs.UnsupportedFileSystemSchemeException: Hadoop is not in the classpath/dependencies

Re: Hadoop FS when running standalone

2020-07-16 Thread Lorenzo Nicora
Thanks Alessandro, I think I solved it. I cannot set any HADOOP_HOME as I have no Hadoop installed on the machine running my tests. But adding *org.apache.flink:flink-shaded-hadoop-2:2.8.3-10.0* as a compile dependency to the Maven profile building the standalone version fixed the issue. Lorenzo
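For reference, a minimal pom.xml sketch of that dependency; the mail only gives the coordinates, so the scope shown is an assumption:

    <dependency>
      <groupId>org.apache.flink</groupId>
      <artifactId>flink-shaded-hadoop-2</artifactId>
      <version>2.8.3-10.0</version>
      <scope>compile</scope>  <!-- assumption: only in the profile building the standalone jar -->
    </dependency>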

Re: Hadoop FS when running standalone

2020-07-16 Thread Alessandro Solimando
Hi Lorenzo, IIRC I had the same error message when trying to write snappified parquet on HDFS with a standalone fat jar. Flink could not "find" the hadoop native/binary libraries (specifically I think for me the issue was related to snappy), because my HADOOP_HOME was not (properly) se

Hadoop FS when running standalone

2020-07-16 Thread Lorenzo Nicora
Hi I need to run my streaming job as a *standalone* Java application, for testing The job uses the Hadoop S3 FS and I need to test it (not a unit test). The job works fine when deployed (I am using AWS Kinesis Data Analytics, so Flink 1.8.2) I have *org.apache.flink:flink-s3-fs-hadoop

Re: [Help] Flink Hadoop dependency problem

2020-07-16 Thread Yang Wang
You can check in the Pod whether the /data directory is mounted correctly, and also run ps in the Pod to see what classpath the started JVM process has and whether it includes the hadoop jars. Of course, Roc Marshal's suggestion of adding flink-shaded-hadoop under $FLINK_HOME/lib also solves the problem. Best, Yang. Roc Marshal wrote on Wed, Jul 15, 2020 at 5:09 PM: > > > > Hi Z-Z, > > you can try > https://repo1.maven.org/maven2/org/apache/flink/

Re: [Help] Flink Hadoop dependency problem

2020-07-15 Thread Roc Marshal
Hi Z-Z, you can try downloading the matching uber jar from https://repo1.maven.org/maven2/org/apache/flink/flink-shaded-hadoop-2-uber/, put the downloaded jar into ${FLINK_HOME}/lib in the flink image, and then start the composed containers. Best, Roc Marshal. On 2020-07-15 10:47:39, "Z-Z" wrote: >I'm using Flink 1.11.0, set up with docker-compose, and the docker-compose fi

Re: [Help] Flink Hadoop dependency problem

2020-07-14 Thread Z-Z
manager - HADOOP_CLASSPATH=/data/hadoop-2.9.2/etc/hadoop:/data/hadoop-2.9.2/share/hadoop/common/lib/*:/data/hadoop-2.9.2/share/hadoop/common/*:/data/hadoop-2.9.2/share/hadoop/hdfs:/data/hadoop-2.9.2/share/hadoop/hdfs/lib/*:/data/hadoop-2.9.2/share/hadoop/hdfs/*:/data/hadoop-2.9.2/share/hadoop

Re: Flink Hadoop dependency

2020-07-08 Thread Xintong Song
Which "jobmanager lib folder" do you mean? How is Flink deployed? Where does the CLI run? Thank you~ Xintong Song On Wed, Jul 8, 2020 at 10:59 AM Z-Z wrote: > Hi, a question for everyone: in Flink > 1.10.0 I have added the flink-shaded-hadoop-2-uber-2.7.5-10.0.jar file to the jobmanager's lib folder; uploading through the webui the job runs fine, but after submitting through the cli it fails with Could > not find a fi

Flink Hadoop dependency

2020-07-07 Thread Z-Z
Hi, in Flink 1.10.0 I added flink-shaded-hadoop-2-uber-2.7.5-10.0.jar to the jobmanager's lib folder; jobs uploaded through the webui run fine, but submitting through the cli fails with: Could not find a file system implementation for scheme 'hdfs'. The scheme is not directly supported by Flink

Re: Dockerised Flink 1.8 with Hadoop S3 FS support

2020-07-03 Thread Yang Wang
Hi Lorenzo, Since Flink 1.8 does not support the plugin mechanism to load filesystems, you need to copy flink-s3-fs-hadoop-*.jar from opt to the lib directory. The dockerfile could be like the following.

FROM flink:1.8-scala_2.11
RUN cp /opt/flink/opt/flink-s3-fs-hadoop-*.jar /opt/flink/lib

Then build you

Dockerised Flink 1.8 with Hadoop S3 FS support

2020-07-02 Thread Lorenzo Nicora
cluster support for S3 Hadoop File System (s3a://), we have on KDA out of the box. Note I do not want to add dependencies to the job directly, as I want to deploy locally exactly the same JAR I deploy to KDA. Flink 1.8 docs [1] say is supported out of the box but does not look to be the case

Re: flink-s3-fs-hadoop retry configuration

2020-06-17 Thread Jeff Henrikson
into the flink-conf.yaml file results in the following DEBUG log output: 2020-05-08 16:20:47,461 DEBUG org.apache.flink.fs.s3hadoop.common.HadoopConfigLoader       [] - Adding Flink config entry for s3.connection.maximum as fs.s3a.connection.maximum to Hadoop config I guess
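A minimal flink-conf.yaml sketch of this mechanism; s3.connection.maximum is confirmed by the log line above, while the retry key is an assumption based on the usual s3a naming:

    s3.connection.maximum: 100    # forwarded as fs.s3a.connection.maximum
    s3.attempts.maximum: 10       # assumed to be forwarded as fs.s3a.attempts.maximum (request retry count)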

Re: How can StreamingFileSink write data to another HA Hadoop cluster, as a yarn job?

2020-06-09 Thread Yun Gao
Hello, have you hit a concrete error so far? My understanding is that you need to configure some parameters in hdfs-site.xml under HADOOP_CONF_DIR on the machines where the flink TMs run. You may want to refer to this earlier question: [1] [1] http://apache-flink.147419.n8.nabble.com/How-to-write-stream-data-to-other-Hadoop-Cluster-by-StreamingFileSink-td792.html

How can StreamingFileSink write data to another HA Hadoop cluster, as a yarn job?

2020-06-09 Thread ???Z?w???w
How can StreamingFileSink write data to another HA Hadoop cluster, as a yarn job? ===== Mobile: 18611696624 QQ: 79434564

Re: flink problem accessing a hadoop cluster

2020-05-29 Thread 了不起的盖茨比
It's OK now. ---- Original message ---- From: "" <13162790...@163.com>; Sent: Fri, May 29, 2020 3:20 PM; To: "user-zh"

Re: flink problem accessing a hadoop cluster

2020-05-29 Thread 了不起的盖茨比
thanks very much, putting hdfs-site.xml in the resource directory works. ---- Original message ---- From: "wangweigu...@stevegame.cn"

Re: flink problem accessing a hadoop cluster

2020-05-29 Thread 程龙
Is the code below running locally? If so, the simplest way is to put the hdfs-site.xml and core-site.xml configuration files in the resources directory. On 2020-05-29 15:06:21, "了不起的盖茨比" <573693...@qq.com> wrote: >A question for everyone: >the hadoop service is TestHACluster, but when I access it through the api and fill in the path >hdfs://TestHACluster/user/flink/test >it goes to TestHACluster:8020, but I

Re: flink problem accessing a hadoop cluster

2020-05-29 Thread wangweigu...@stevegame.cn
From: 了不起的盖茨比; Sent: 2020-05-29 15:06; To: user-zh; Subject: flink problem accessing a hadoop cluster. The hadoop service is TestHACluster, but when accessing through the api with path hdfs://TestHACluster/user/flink/test it goes to TestHACluster:8020

flink problem accessing a hadoop cluster

2020-05-29 Thread 了不起的盖茨比
A question for everyone: the hadoop service is TestHACluster, but when I access it through the api and fill in the path hdfs://TestHACluster/user/flink/test, it goes to TestHACluster:8020; with hive it accesses TestHACluster:8020 fine. StreamExecutionEnvironment

Re:Re: How to get a flink-shaded-hadoop package supporting hadoop 3.2.1 for flink 1.10?

2020-05-20 Thread Jeff
Mail- >> From: Jeff >> Sent: 2020-05-20 10:09:10 (Wednesday) >> To: flink-zh >> Cc: >> Subject: How to get a flink-shaded-hadoop package supporting hadoop 3.2.1 for flink 1.10? >> >> hi all, >> I couldn't find a flink-shaded-hadoop package supporting hadoop 3.2.1 in mvnrepository, >> nor a matching hadoop module in the standalone flink-shaded project,

Re: How to get a flink-shaded-hadoop package supporting hadoop 3.2.1 for flink 1.10?

2020-05-19 Thread Jingsong Li
Hi, if you can use the HADOOP_CLASSPATH environment variable, that's the best option. [1] https://ci.apache.org/projects/flink/flink-docs-master/ops/deployment/hadoop.html#providing-hadoop-classes Best, Jingsong Lee. On Wed, May 20, 2020 at 10:22 AM 刘大龙 wrote: > Hi, > take a look at these two links: > 1: https://www.mail-archive.com/dev@flink.apache.org

Re: How to get a flink-shaded-hadoop package supporting hadoop 3.2.1 for flink 1.10?

2020-05-19 Thread 111
Hi, download the corresponding source code and rebuild it. Reference: https://mp.weixin.qq.com/s/ox6-gPlVtZb5Cb8Tj6b3nw best. Xinghalo

Re: How to get a flink-shaded-hadoop package supporting hadoop 3.2.1 for flink 1.10?

2020-05-19 Thread 刘大龙
Hi, take a look at these two links: 1: https://www.mail-archive.com/dev@flink.apache.org/msg37293.html 2: https://issues.apache.org/jira/browse/FLINK-11086 > -Original Mail- > From: Jeff > Sent: 2020-05-20 10:09:10 (Wednesday) > To: flink-zh > Cc: > Subject: How to get a flink-shaded-hadoop package supporting hadoop 3.2.1 for flink 1.10

How to get a flink-shaded-hadoop package supporting hadoop 3.2.1 for flink 1.10?

2020-05-19 Thread Jeff
hi all, I couldn't find a flink-shaded-hadoop package supporting hadoop 3.2.1 in mvnrepository, nor a matching hadoop module in the standalone flink-shaded project. How can I get this package?

Re: Testing jobs locally agains secure Hadoop cluster

2020-05-11 Thread Khachatryan Roman
Hi Őrhidi, Can you please provide some details about the errors you get? Regards, Roman On Mon, May 11, 2020 at 9:32 AM Őrhidi Mátyás wrote: > Dear Community, > > I'm having troubles testing jobs against a secure Hadoop cluster. Is that > possible? The mini cluster seems to

Testing jobs locally agains secure Hadoop cluster

2020-05-11 Thread Őrhidi Mátyás
Dear Community, I'm having troubles testing jobs against a secure Hadoop cluster. Is that possible? The mini cluster seems to not load any security modules. Thanks, Matyas

Re: flink-s3-fs-hadoop retry configuration

2020-05-08 Thread Robert Metzger
as fs.s3a.connection.maximum to Hadoop config I guess that is the recommended way of passing configuration into the S3 connectors of Flink. You also asked how to detect retries: DEBUG-log level is helpful again. I just tried connecting against an invalid port, and got these messages: 2020-05-08 16:26

Re: flink-s3-fs-hadoop retry configuration

2020-05-08 Thread Robert Metzger
Hey Jeff, Which Flink version are you using? Have you tried configuring the S3 filesystem via Flink's config yaml? Afaik all config parameters prefixed with "s3." are mirrored into the Hadoop file system connector. On Mon, May 4, 2020 at 8:45 PM Jeff Henrikson wrote: > > 2

Re: Re: Building a standalone-mode HA flink cluster outside an existing Hadoop cluster

2020-05-07 Thread wangl...@geekplus.com.cn
I tried it and it works, but now there is an HDFS access problem. My hadoop is managed by Alibaba Cloud EMR; on the machines EMR manages, HDFS can be accessed as hdfs://emr-cluster:8020/, but the Flink I deployed is not managed by EMR and cannot resolve that address, so I can only write it as hdfs://active-namenode-ip:8020/, and the NameNode loses its HA capability. Is there a way to solve this? Thanks, 王磊 wangl...@geekplus.com.cn Sender: Andrew Send Time: 2020-05-07 12:31
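A minimal client-side hdfs-site.xml sketch for this situation, assuming emr-cluster is an HA nameservice with two namenodes (all addresses are placeholders):

    <property><name>dfs.nameservices</name><value>emr-cluster</value></property>
    <property><name>dfs.ha.namenodes.emr-cluster</name><value>nn1,nn2</value></property>
    <property><name>dfs.namenode.rpc-address.emr-cluster.nn1</name><value>namenode1-ip:8020</value></property>
    <property><name>dfs.namenode.rpc-address.emr-cluster.nn2</name><value>namenode2-ip:8020</value></property>
    <property><name>dfs.client.failover.proxy.provider.emr-cluster</name><value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value></property>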

Re: Building a standalone-mode HA flink cluster outside an existing Hadoop cluster

2020-05-06 Thread Andrew
https://ci.apache.org/projects/flink/flink-docs-stable/ops/jobmanager_high_availability.html ---Original Mail--- From: "wangl...@geekplus.com.cn"

Re: Building a standalone-mode HA flink cluster outside an existing Hadoop cluster

2020-05-06 Thread wangl...@geekplus.com.cn
This doc looks right, I'll try it first: https://ci.apache.org/projects/flink/flink-docs-stable/ops/jobmanager_high_availability.html wangl...@geekplus.com.cn From: wangl...@geekplus.com.cn Sent: 2020-05-07 12:23 To: user-zh Subject: Building a standalone-mode HA flink cluster outside an existing Hadoop cluster. There is already a Hadoop cluster. Outside this cluster (different machines, network reachable) I want to deploy a

Overriding hadoop core-site.xml keys using the flink-fs-hadoop-shaded assemblies

2020-05-05 Thread Jeff Henrikson
Has anyone had success overriding hadoop core-site.xml keys using the flink-fs-hadoop-shaded assemblies? If so, what versions were known to work? Using btrace, I am seeing a bug in the hadoop shaded dependencies distributed with 1.10.0. Some (but not all) of the core-site.xml keys cannot

Flink - Hadoop Connectivity - Unable to read file

2020-05-05 Thread Samik Mukherjee
Hi All, I am trying to get some file from HDFS which is locally installed. But I am not able to. I tried with both these ways. But all the time the program is ending with "Process finished with exit code 239." Any help will be helpful- public class Processor { public static void

Re: flink-s3-fs-hadoop retry configuration

2020-05-04 Thread Jeff Henrikson
> 2) How can I tell if flink-s3-fs-hadoop is actually managing to pick up > the hadoop configuration I have provided, as opposed to some separate > default configuration? I'm reading the docs and source of flink-fs-hadoop-shaded. I see that core-default-shaded.xml has fs.s3a.connectio

flink-s3-fs-hadoop retry configuration

2020-05-01 Thread Jeff Henrikson
Hello Flink users, I could use help with three related questions: 1) How can I observe retries in the flink-s3-fs-hadoop connector? 2) How can I tell if flink-s3-fs-hadoop is actually managing to pick up the hadoop configuration I have provided, as opposed to some separate default

Re: Hadoop user jar for flink 1.9 plus

2020-03-20 Thread Vishal Santoshi
Awesome, thanks! On Tue, Mar 17, 2020 at 11:14 AM Chesnay Schepler wrote: > You can download flink-shaded-hadoop from the downloads page: > https://flink.apache.org/downloads.html#additional-components > > On 17/03/2020 15:56, Vishal Santoshi wrote: > > We have been on flink 1

Re: Hadoop user jar for flink 1.9 plus

2020-03-17 Thread Chesnay Schepler
You can download flink-shaded-hadoop from the downloads page: https://flink.apache.org/downloads.html#additional-components On 17/03/2020 15:56, Vishal Santoshi wrote: We have been on flink 1.8.x on production and were planning to go to flink 1.9 or above. We have always used hadoop uber jar

Hadoop user jar for flink 1.9 plus

2020-03-17 Thread Vishal Santoshi
We have been on flink 1.8.x on production and were planning to go to flink 1.9 or above. We have always used hadoop uber jar from https://mvnrepository.com/artifact/org.apache.flink/flink-shaded-hadoop2-uber but it seems they go up to 1.8.3 and their distribution ends 2019. How do or where do we

Re: Building with Hadoop 3

2020-03-04 Thread Stephan Ewen
Have you tried to just export Hadoop 3's classpath to `HADOOP_CLASSPATH` and see if that works out of the box? If the main use case is HDFS access, then there is a fair chance it might just work, because Flink uses only a small subset of the Hadoop FS API which is stable between 2.x and 3.x

RE: Building with Hadoop 3

2020-03-03 Thread LINZ, Arnaud
Hello, Have you shared it somewhere on the web already? Best, Arnaud. From: vino yang Sent: Wednesday, December 4, 2019 11:55 To: Márton Balassi Cc: Chesnay Schepler; Foster, Craig; user@flink.apache.org; d...@flink.apache.org Subject: Re: Building with Hadoop 3 Hi Marton, Thanks for your

Re: Flink 1.10 - Hadoop libraries integration with plugins and class loading

2020-02-28 Thread Piotr Nowojski
Hi, > Since we have "flink-s3-fs-hadoop" at the plugins folder and therefore being > dynamically loaded upon task/job manager(s) startup (also, we are keeping > Flink's default inverted class loading strategy), shouldn't Hadoop > dependencies be loaded b

Flink 1.10 - Hadoop libraries integration with plugins and class loading

2020-02-26 Thread Ricardo Cardante
is submitted to a Flink setup running on docker, we're getting the following exception: - java.lang.NoClassDefFoundError: org/apache/hadoop/fs/Path - Which refers to the usage of that class in a RichSinkFunction while building

Re: Re: Flink connect hive with hadoop HA

2020-02-14 Thread Robert Metzger
There's a configuration value "env.hadoop.conf.dir" to set the hadoop configuration directory: https://ci.apache.org/projects/flink/flink-docs-master/ops/config.html#env-hadoop-conf-dir If the files in that directory correctly configure Hadoop HA, the client side should pick up the confi
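A minimal flink-conf.yaml sketch of that setting (the path is a placeholder):

    env.hadoop.conf.dir: /etc/hadoop/conf    # directory containing core-site.xml and hdfs-site.xml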

Re:Re: Flink connect hive with hadoop HA

2020-02-10 Thread sunfulin
Hi guys, thanks for the kind reply. Actually I want to know how to change the client-side hadoop conf while using the table API in my program. Hoping for some useful suggestions. At 2020-02-11 02:42:31, "Bowen Li" wrote: Hi sunfulin, Sounds like you didn't config the hadoop HA correctly on

Re: Flink connect hive with hadoop HA

2020-02-10 Thread Bowen Li
Hi sunfulin, Sounds like you didn't config the hadoop HA correctly on the client side according to [1]. Let us know if it helps resolve the issue. [1] https://stackoverflow.com/questions/25062788/namenode-ha-unknownhostexception-nameservice1 On Mon, Feb 10, 2020 at 7:11 AM Khachatryan Roman

Re: Flink connect hive with hadoop HA

2020-02-10 Thread Khachatryan Roman
Hi, Could you please provide a full stacktrace? Regards, Roman On Mon, Feb 10, 2020 at 2:12 PM sunfulin wrote: > Hi, guys > I am using Flink 1.10 and test functional cases with hive intergration. > Hive with 1.1.0-cdh5.3.0 and with hadoop HA enabled.Running flink job I can > se

Flink connect hive with hadoop HA

2020-02-10 Thread sunfulin
Hi guys, I am using Flink 1.10 to test functional cases with hive integration: Hive 1.1.0-cdh5.3.0 with hadoop HA enabled. Running the flink job I can see a successful connection with the hive metastore, but cannot read table data; it fails with: java.lang.IllegalArgumentException

Re: Location of flink-s3-fs-hadoop plugin (Flink 1.9.0 on EMR)

2020-01-28 Thread Arvid Heise
└── s3 (name is arbitrary)
    └── flink-s3-fs-hadoop.jar

On Tue, Jan 28, 2020 at 9:18 AM Arvid Heise wrote: > Hi Aaron, > > I encountered a similar issue when running on EMR. On the slaves, there > are some lingering hadoop versions that are older than 2.7 (it was 2.6 if

Re: Location of flink-s3-fs-hadoop plugin (Flink 1.9.0 on EMR)

2020-01-28 Thread Arvid Heise
Hi Aaron, I encountered a similar issue when running on EMR. On the slaves, there are some lingering hadoop versions that are older than 2.7 (it was 2.6 if I remember correctly), which bleed into the classpath of Flink. Flink checks the Hadoop version to check if certain capabilities like file

Re: Can blink (based on flink 1.5.1) use two hadoop clusters?

2020-01-26 Thread Yun Tang
/yarn_setup.html#background--internals Best, Yun Tang. From: Yong Sent: Wednesday, January 22, 2020 14:53 To: dev; user-zh Subject: Can blink (based on flink 1.5.1) use two hadoop clusters? Hi all, can flink use two hadoop clusters? Background: I have set up a flink standalone cluster based on blink, with state stored on the company's hadoop hdfs, using kerberos authentication

Re: Location of flink-s3-fs-hadoop plugin (Flink 1.9.0 on EMR)

2020-01-24 Thread Aaron Langford
This seems to confirm that the S3 file system implementation is not being loaded when you start your job. Can you share the details of how you are getting the flink-s3-fs-hadoop artifact onto your cluster? Are you simply ssh-ing to the master node and doing this manually? Are you doing this via

Re: Location of flink-s3-fs-hadoop plugin (Flink 1.9.0 on EMR)

2020-01-24 Thread Senthil Kumar
e.org" Subject: Re: Location of flink-s3-fs-hadoop plugin (Flink 1.9.0 on EMR) When creating your cluster, you can provide configurations that EMR will find the right home for. Example for the aws cli: aws emr create-cluster ... --configurations '[{ "Classification": "flin

Re: Location of flink-s3-fs-hadoop plugin (Flink 1.9.0 on EMR)

2020-01-23 Thread Aaron Langford
pId": "", > "Configurations": [{ > "Classification": "flink-log4j", > "Properties": { > "log4j.rootLogger": "DEBUG,file" > } > },{ > "Classi

Re: Location of flink-s3-fs-hadoop plugin (Flink 1.9.0 on EMR)

2020-01-23 Thread Senthil Kumar
Could you tell us how to turn on debug level logs? We attempted this (on driver) sudo stop hadoop-yarn-resourcemanager followed the instructions here https://stackoverflow.com/questions/27853974/how-to-set-debug-log-level-for-resourcemanager and sudo start hadoop-yarn-resourcemanager but we

Can blink (based on flink 1.5.1) use two hadoop clusters?

2020-01-21 Thread Yong
Hi all, can flink use two hadoop clusters? Background: I have set up a flink standalone cluster based on blink; state is stored on the company's hadoop hdfs and uses kerberos authentication

Re: Location of flink-s3-fs-hadoop plugin (Flink 1.9.0 on EMR)

2020-01-21 Thread Aaron Langford
thil Kumar wrote: > Yang, I appreciate your help! Please let me know if I can provide any > other info. > > > > I resubmitted my executable jar file as a step to the flink EMR and here > are all the exceptions. I see two of them. > > > > I fished them out of /v

Re: Location of flink-s3-fs-hadoop plugin (Flink 1.9.0 on EMR)

2020-01-21 Thread Senthil Kumar
Yang, I appreciate your help! Please let me know if I can provide any other info. I resubmitted my executable jar file as a step to the flink EMR and here are all the exceptions. I see two of them. I fished them out of /var/log/Hadoop//syslog 2020-01-21 16:31:37,587 ERROR

Re: Location of flink-s3-fs-hadoop plugin (Flink 1.9.0 on EMR)

2020-01-18 Thread Yang Wang
I think this exception is not because the hadoop version isn't high enough. It seems that the "s3" URI scheme could not be recognized by `S3FileSystemFactory`. So it fallbacks to the `HadoopFsFactory`. Could you share the debug level jobmanager/taskmanger logs so that we could confi

Location of flink-s3-fs-hadoop plugin (Flink 1.9.0 on EMR)

2020-01-17 Thread Senthil Kumar
Hello all, Newbie here! We are running in Amazon EMR with the following installed in the EMR Software Configuration Hadoop 2.8.5 JupyterHub 1.0.0 Ganglia 3.7.2 Hive 2.3.6 Flink 1.9.0 I am trying to get a Streaming job from one S3 bucket into an another S3 bucket using

Re: Can flink set the hadoop configuration file directory in code?

2020-01-12 Thread LakeShen
son. > > > LJY wrote on Thu, Jan 9, 2020 at 3:52 PM: > > Hi all: > > currently the hadoop configuration files are set via fs.hdfs.hadoopconf. > > Can a user skip fs.hdfs.hadoopconf in the configuration file and set the hadoop directory manually in code?

Re: A question about flink and hadoop versions

2019-12-06 Thread jingwen jingwen
No problem; hadoop 2.8 and hadoop 3.0 just differ in some features, which doesn't affect normal flink usage. As for the flink source failing to build, you need to look into the cause; it may be an issue with some maven repositories. cljb...@163.com wrote on Fri, Dec 6, 2019 at 4:14 PM: > Hello: > > A question about flink and hadoop versions. Our production environment runs hadoop 3.0+, and the flink 1.9+ downloads on the official site have no pre-bundled hadoop 3.0+ package. > But I downloaded flink 1.9.1 myself and then downloaded, from the optional components,

A question about flink and hadoop versions

2019-12-06 Thread cljb...@163.com
Hello: A question about flink and hadoop versions. Our production environment runs hadoop 3.0+, and the flink 1.9+ downloads on the official site have no pre-bundled hadoop 3.0+ package. But I downloaded flink 1.9.1 myself, plus the optional Pre-bundled Hadoop 2.8.3 (asc, sha1) component, put that jar under flink's lib, and it can operate on hadoop normally. Does this have any impact? I'm asking because building the flink source myself never compiled successfully. Please advise, thanks! 陈军 cljb...@163.com

Re: Building with Hadoop 3

2019-12-04 Thread vino yang
Hi Marton, Thanks for your explanation. Personally, I look forward to your contribution! Best, Vino. Márton Balassi wrote on Wed, Dec 4, 2019 at 5:15 PM: > Wearing my Cloudera hat I can tell you that we have done this exercise for > our distros of the 3.0 and 3.1 Hadoop versions. We have not contr

Re: Building with Hadoop 3

2019-12-04 Thread Márton Balassi
Wearing my Cloudera hat I can tell you that we have done this exercise for our distros of the 3.0 and 3.1 Hadoop versions. We have not contributed these back just yet, but we are open to do so. If the community is interested we can contribute those changes back to flink-shaded and suggest

Re: Building with Hadoop 3

2019-12-04 Thread Chesnay Schepler
There's no JIRA and no one actively working on it. I'm not aware of any investigations on the matter; hence the first step would be to just try it out. A flink-shaded artifact isn't a hard requirement; Flink will work with any 2.X hadoop distribution (provided that there aren't any dependency

Re: Building with Hadoop 3

2019-12-03 Thread vino yang
cc @Chesnay Schepler to answer this question. Foster, Craig wrote on Wed, Dec 4, 2019 at 1:22 AM: > Hi: > > I don’t see a JIRA for Hadoop 3 support. I see a comment on a JIRA here > from a year ago that no one is looking into Hadoop 3 support [1]. Is there > a document or JIRA that now exi

Building with Hadoop 3

2019-12-03 Thread Foster, Craig
Hi: I don’t see a JIRA for Hadoop 3 support. I see a comment on a JIRA here from a year ago that no one is looking into Hadoop 3 support [1]. Is there a document or JIRA that now exists which would point to what needs to be done to support Hadoop 3? Right now builds with Hadoop 3 don’t work

Re: flink 1.9.1 built against hadoop 3.0.0-cdh6.0.1 fails on yarn

2019-10-24 Thread renc law
Could you share your build and deployment experience? I'm also planning to set up 1.8.1 on cdh 6.2.0 recently. > On Oct 25, 2019, at 10:31 AM, homex wu wrote: > > It was a permissions issue, false alarm. > > homex wu wrote on Fri, Oct 25, 2019 at 9:42 AM: >> Scenario: builds against hadoop 3.0.0-cdh6.0.1 since 1.8; running ./bin/yarn-session.sh -tm 1024 -s 4 >> fails at startup, and rolling back to 1.7.2 works. Now flink 1.9.1 built against 3.0.0-cdh6.0.1,

Re: flink 1.9.1 built against hadoop 3.0.0-cdh6.0.1 fails on yarn

2019-10-24 Thread homex wu
It was a permissions issue, false alarm. homex wu wrote on Fri, Oct 25, 2019 at 9:42 AM: > Scenario: builds against hadoop 3.0.0-cdh6.0.1 since 1.8; running ./bin/yarn-session.sh -tm 1024 -s 4 > fails at startup, and rolling back to 1.7.2 works. Now flink 1.9.1 built against 3.0.0-cdh6.0.1 still fails on yarn. > Could it be a hadoop version problem? > > Environment variable set: > export HADOOP_CLASSPATH=/opt/cloudera/parcels/CDH/lib/h

Re: How to write stream data to other Hadoop Cluster by StreamingFileSink

2019-10-09 Thread Jun Zhang
Hi Yang: thank you very much for your reply. I have added the configurations on my hadoop cluster client; both hdfs-site.xml and core-site.xml are configured, and the client can read mycluster1 and mycluster2, but when I submit the flink job to the yarn cluster, the hadoop client configurations

Re: How to write stream data to other Hadoop Cluster by StreamingFileSink

2019-10-08 Thread Yang Wang
: org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider
dfs.namenode.rpc-address.mycluster2.nn1: nn1-address
dfs.namenode.rpc-address.mycluster2.nn2: nn2-address

Best, Yang. Jun Zhang <825875...@qq.com> wrote on Sat, Oct 5, 2019 at 1:45 PM: > > Hi,all: > > I have 2 hadoop cluster
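A slightly fuller sketch of the client-side settings implied above, with the nameservice and namenode lists spelled out (values are placeholders):

    dfs.nameservices: mycluster1,mycluster2
    dfs.ha.namenodes.mycluster2: nn1,nn2
    dfs.namenode.rpc-address.mycluster2.nn1: nn1-address
    dfs.namenode.rpc-address.mycluster2.nn2: nn2-address
    dfs.client.failover.proxy.provider.mycluster2: org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider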

How to write stream data to other Hadoop Cluster by StreamingFileSink

2019-10-04 Thread Jun Zhang
Hi all: I have 2 hadoop clusters (hdfs://mycluster1 and hdfs://mycluster2), both configured with HA. I have a job that reads streaming data from kafka and writes it to hdfs via StreamingFileSink. I have deployed the job on mycluster1 (flink on yarn), and I want to write the data to mycluster2

Re: best practices on getting flink job logs from Hadoop history server?

2019-09-05 Thread Yang Wang
orary files of the YARN session in the home >> directory will not be removed. >> >> Best >> Yun Tang >> >> -- >> *From:* Zhu Zhu >> *Sent:* Friday, August 30, 2019 16:24 >> *To:* Yu Yang >> *Cc:* user >> *S

Re: best practices on getting flink job logs from Hadoop history server?

2019-09-05 Thread Yu Yang
l not be removed. > > Best > Yun Tang > > -- > *From:* Zhu Zhu > *Sent:* Friday, August 30, 2019 16:24 > *To:* Yu Yang > *Cc:* user > *Subject:* Re: best practices on getting flink job logs from Hadoop > history server? > > Hi Yu, > > Rega

Re: Build Flink against a vendor specific Hadoop version

2019-08-31 Thread Chesnay Schepler
yourself in your local maven installation. On 29/08/2019 18:48, Stephan Ewen wrote: The easiest thing is not to build Flink against a specific Hadoop version at all, but just to take plain Flink (Hadoop free) and export the HADOOP_CLASSPATH variable to point to the vendor libraries. Does that wor

Re: Build Flink against a vendor specific Hadoop version

2019-08-30 Thread Elise RAMÉ
Ewen wrote: >> The easiest thing is not to build Flink against a specific Hadoop version at >> all, but just to take plain Flink (Hadoop free) and export the >> HADOOP_CLASSPATH variable to point to the vendor libraries. >> >> Does that work for you? >>

Re: best practices on getting flink job logs from Hadoop history server?

2019-08-30 Thread Yun Tang
note that the temporary files of the YARN session in the home directory will not be removed. Best Yun Tang From: Zhu Zhu Sent: Friday, August 30, 2019 16:24 To: Yu Yang Cc: user Subject: Re: best practices on getting flink job logs from Hadoop history server? Hi

Re: best practices on getting flink job logs from Hadoop history server?

2019-08-30 Thread Zhu Zhu
multiple TM logs. However it can be much smaller than the "yarn logs ..." generated log. Thanks, Zhu Zhu. Yu Yang wrote on Fri, Aug 30, 2019 at 3:58 PM: > Hi, > > We run flink jobs through yarn on hadoop clusters. One challenge that we > are facing is to simplify flink job log access. >

best practices on getting flink job logs from Hadoop history server?

2019-08-30 Thread Yu Yang
Hi, We run flink jobs through yarn on hadoop clusters. One challenge that we are facing is to simplify flink job log access. The flink job logs can be accessible using "yarn logs $application_id". That approach has a few limitations: 1. It is not straightforward to find yarn appl
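For reference, the usual full form of that command (the application id is a placeholder):

    yarn logs -applicationId application_1567158963415_0001 > flink-job.log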

Flink fs s3 shaded hadoop: KerberosAuthException when using StreamingFileSink to S3

2019-08-08 Thread Achyuth Narayan Samudrala
Hi, We are trying to use StreamingFileSink to write to a S3 bucket. It's a simple job which reads from Kafka and sinks to S3. The credentials for s3 are configured in the flink cluster. We are using flink 1.7.2 without pre-bundled hadoop. As suggested in the documentation we have added the flink

Re: flink filesystem 1.7.2 on Hadoop 2.7: BucketingSink.reflectTruncate() risks writing many small files to hdfs

2019-06-24 Thread wxy
Thanks for resolving this; 1.8 has indeed changed it. > On Jun 24, 2019, at 3:49 PM, Biao Liu wrote: > > Hi, I checked the code: 1.7.2 indeed has this problem, and the latest code already fixes it, see [1]. > If possible, upgrading to 1.8.0 includes that fix. > > 1. > https://github.com/apache/flink/commit/24c2e17c8d52ae2f0f897a5806a3a44fdf62b0a5 > > 巫旭阳 wrote on Mon, Jun 24, 2019 at 2:40 PM: > >> The source is at BucketingSink line 615 >>

Re: flink filesystem 1.7.2 on Hadoop 2.7: BucketingSink.reflectTruncate() risks writing many small files to hdfs

2019-06-24 Thread Biao Liu
Hi, I checked the code: 1.7.2 indeed has this problem, and the latest code already fixes it, see [1]. If possible, upgrading to 1.8.0 includes that fix. 1. https://github.com/apache/flink/commit/24c2e17c8d52ae2f0f897a5806a3a44fdf62b0a5 巫旭阳 wrote on Mon, Jun 24, 2019 at 2:40 PM: > The source is at BucketingSink line 615 > Path testPath = new Path(basePath, UUID.randomUUID().toString()); > try

flink filesystem 1.7.2 on Hadoop 2.7: BucketingSink.reflectTruncate() risks writing many small files to hdfs

2019-06-24 Thread 巫旭阳
The source is at BucketingSink line 615:

Path testPath = new Path(basePath, UUID.randomUUID().toString());
try (FSDataOutputStream outputStream = fs.create(testPath)) {
    outputStream.writeUTF("hello");
} catch (IOException e) {
    LOG.error("Could not create file for checking if truncate works.", e);
    throw new

Re: Flink 1.7.1 flink-s3-fs-hadoop-1.7.1 doesn't delete older chk- directories

2019-06-07 Thread anaray
Hi Fabian, Thank you. Your observation is correct. The stale directories belong to the failed checkpoints. So it is related to FLINK-10855. I will closely follow FLINK-10855 and test when fix is available Thank You, anaray -- Sent from:

Re: Flink 1.7.1 flink-s3-fs-hadoop-1.7.1 doesn't delete older chk- directories

2019-06-07 Thread Fabian Hueske
and describe the bug? Thank you, Fabian [1] https://issues.apache.org/jira/browse/FLINK-10855 On Thu, Jun 6, 2019 at 9:04 PM, anaray wrote: > Hi, > > I am using 1.7.1 and we store checkpoints in Ceph and we use > flink-s3-fs-hadoop-1.7.1 to connect to Ceph. I have only 1 checkpoin

Flink 1.7.1 flink-s3-fs-hadoop-1.7.1 doesn't delete older chk- directories

2019-06-06 Thread anaray
Hi, I am using 1.7.1; we store checkpoints in Ceph and use flink-s3-fs-hadoop-1.7.1 to connect to Ceph. I have only 1 checkpoint retained. The issue I see is that previous/old chk- directories are still around. I verified that those older directories don't contain any checkpoint data, but the directories
