Re: Hadoop dependency configuration issue

2024-04-01 Thread Biao Geng
Hi fengqi, the error "Hadoop is not in the classpath/dependencies." means that the classes HDFS needs, such as org.apache.hadoop.conf.Configuration and org.apache.hadoop.fs.FileSystem, could not be found. If your system environment has Hadoop, the classpath is usually set like this: export HADOOP_CLASSPATH=`hadoop classpath`. If you are submitting to a local standalone Flink cluster, you can check the log files Flink generates, which print

Re: [DISCUSS] Hadoop 2 vs Hadoop 3 usage

2024-01-15 Thread Yang Wang
I could share some metrics about Alibaba Cloud EMR clusters. The ratio of Hadoop 2 vs Hadoop 3 is 1:3. Best, Yang On Thu, Dec 28, 2023 at 8:16 PM Martijn Visser wrote: > Hi all, I want to get some insights on how many users are still using Hadoop 2 vs how many users are us

[DISCUSS] Hadoop 2 vs Hadoop 3 usage

2023-12-28 Thread Martijn Visser
Hi all, I want to get some insights on how many users are still using Hadoop 2 vs how many users are using Hadoop 3. Flink currently requires a minimum version of Hadoop 2.10.2 for certain features, but also extensively uses Hadoop 3 (like for the file system implementations). Hadoop 2 has

Re: Hadoop Error on ECS Fargate

2023-07-17 Thread Martijn Visser
Hi Mengxi Wang, Which Flink version are you using? Best regards, Martijn On Thu, Jul 13, 2023 at 3:21 PM Wang, Mengxi X via user <user@flink.apache.org> wrote: > Hi community, We got this Kerberos error with Hadoop as the file system on ECS Fargate de

Hadoop Error on ECS Fargate

2023-07-13 Thread Wang, Mengxi X via user
Hi community, We got this Kerberos error with Hadoop as the file system on an ECS Fargate deployment. Caused by: org.apache.hadoop.security.KerberosAuthException: failure to login: javax.security.auth.login.LoginException: java.lang.NullPointerException: invalid null input: name Caused

Re: fail to mount hadoop-config-volume when using flink-k8s-operator

2022-10-13 Thread Yang Wang
Currently, exporting the env "HADOOP_CONF_DIR" only works for the native K8s integration. The flink client will try to create the hadoop-config-volume automatically if the hadoop env is found. If you want to set the HADOOP_CONF_DIR in the docker image, please also make sure the specified h

fail to mount hadoop-config-volume when using flink-k8s-operator

2022-10-12 Thread Liting Liu (litiliu)
Hi, community: I'm using flink-k8s-operator v1.2.0 to deploy a flink job, and the "HADOOP_CONF_DIR" environment variable was set in the image that I built from flink:1.15. I found the taskmanager pod was trying to mount a volume named "hadoop-config-volume"

Setting boundedness for legacy Hadoop sequence file sources

2022-05-03 Thread Ken Krugler
Hi all, I’m converting several batch Flink workflows to streaming, with bounded sources. Some of our sources are reading Hadoop sequence files via StreamExecutionEnvironment.createInput(HadoopInputFormat). The problem is that StreamGraphGenerator.existsUnboundedSource is returning true
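[Editor's note: as a point of reference, a minimal sketch of this legacy source pattern, assuming the flink-hadoop-compatibility dependency; the key/value types and path are hypothetical, not taken from Ken's job.]

```java
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.hadoopcompatibility.HadoopInputs;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.Text;

public class SequenceFileRead {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // Legacy InputFormat source: the files are finite, but the generated
        // stream graph still treats this source as unbounded, which is the
        // behavior the thread is asking about.
        DataStream<Tuple2<Text, BytesWritable>> records = env.createInput(
                HadoopInputs.readSequenceFile(Text.class, BytesWritable.class, "hdfs:///data/input"));
        records.print();
        env.execute("read-sequence-file");
    }
}
```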

Re: [statefun] hadoop dependencies and StatefulFunctionsConfigValidator

2022-03-09 Thread Igal Shilman
…loaded over statefun's protobuf-java 3.7.1, and NoSuchMethod exceptions occur. We hacked together a version of statefun that doesn't perform the check whether the classloader settings contain the three patterns from above, and as long as our job u

Re: [statefun] hadoop dependencies and StatefulFunctionsConfigValidator

2022-03-08 Thread Filip Karnicki
…protobuf-java 3.7.1, and NoSuchMethod exceptions occur. We hacked together a version of statefun that doesn't perform the check whether the classloader settings contain the three patterns from above, and as long as our job uses protobuf-java 3.7.1 and the com.google.protob

Re: [statefun] hadoop dependencies and StatefulFunctionsConfigValidator

2022-03-08 Thread Roman Khachatryan
…and the com.google.protobuf pattern is not present in the classloader.parent-first-patterns.additional setting, then all is well. Aside from removing old hadoop from the classpath, which may not be possible given that it's a shared cluster, is there anything we can do o

[statefun] hadoop dependencies and StatefulFunctionsConfigValidator

2022-03-04 Thread Filip Karnicki
patterns from above, and as long as our job uses protobuf-java 3.7.1 and the com.google.protobuf pattern is not present in the classloader.parent-first-patterns.additional setting, then all is well. Aside from removing old hadoop from the classpath, which may not be possible given that it's

Re: [DISCUSS] Changing the minimal supported version of Hadoop

2022-01-03 Thread David Morávek
As there were no strong objections, we'll proceed with bumping the Hadoop version to 2.8.5 and removing the safeguards and the CI for any earlier versions. This will effectively make Hadoop 2.8.5 the lowest supported version in Flink 1.15. Best, D. On Thu, Dec 23, 2021 at 11:03 AM Till

Re: [DISCUSS] Changing the minimal supported version of Hadoop

2021-12-23 Thread Till Rohrmann
If there are no users strongly objecting to dropping Hadoop support for < 2.8, then I am +1 for this since otherwise we won't gain a lot as Xintong said. Cheers, Till On Wed, Dec 22, 2021 at 10:33 AM David Morávek wrote: > Agreed, if we drop the CI for lower versions, there is ac

Re: [DISCUSS] Changing the minimal supported version of Hadoop

2021-12-22 Thread David Morávek
Agreed, if we drop the CI for lower versions, there is actually no point in having safeguards, as we can't really test for them. Maybe one more thought (it's more of a feeling): I feel that users running really old Hadoop versions are usually slower to adopt (they most likely use what the current

Re: [DISCUSS] Changing the minimal supported version of Hadoop

2021-12-21 Thread Xintong Song
Sorry to join the discussion late. +1 for dropping support for hadoop versions < 2.8 from my side. TBH, wrapping the reflection-based logic with safeguards sounds a bit neither fish nor fowl to me. It weakens the major benefits that we look for by dropping support for early versi

Re: [DISCUSS] Changing the minimal supported version of Hadoop

2021-12-21 Thread David Morávek
CC user@f.a.o Is anyone aware of something that blocks us from doing the upgrade? D. On Tue, Dec 21, 2021 at 5:50 PM David Morávek wrote: > Hi Martijn, from personal experience, most Hadoop users are lagging behind the release lines by a lot, because upgrading a Ha

Re: Passing arbitrary Hadoop s3a properties from FileSystem SQL Connector options

2021-12-15 Thread Arvid Heise
…The effort will be tracked under the following ticket: https://issues.apache.org/jira/browse/FLINK-19589 I will loop in Arvid (in CC), who might help you in contributing the missing functionality. Regards, Timo

Re: Passing arbitrary Hadoop s3a properties from FileSystem SQL Connector options

2021-12-13 Thread Timothy James
…Regards, Timo. On 10.12.21 23:48, Timothy James wrote: > Hi, The Hadoop s3a library itself supports some properties we need, but the "FileSystem SQL Connector" (via FileSystemTableFactory) does not pass connec

Re: Passing arbitrary Hadoop s3a properties from FileSystem SQL Connector options

2021-12-13 Thread Timo Walther
…On 10.12.21 23:48, Timothy James wrote: Hi, The Hadoop s3a library itself supports some properties we need, but the "FileSystem SQL Connector" (via FileSystemTableFactory) does not pass connector options for these to the "Hadoop/Presto S3 File Systems plugins" (via S3FileSystemFacto

Passing arbitrary Hadoop s3a properties from FileSystem SQL Connector options

2021-12-10 Thread Timothy James
Hi, The Hadoop s3a library itself supports some properties we need, but the "FileSystem SQL Connector" (via FileSystemTableFactory) does not pass connector options for these to the "Hadoop/Presto S3 File Systems plugins" (via S3FileSystemFactory). Instead, only Job-global
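[Editor's note: until FLINK-19589 lands, the job-global route is the one that works. A hedged sketch: the s3.* keys below follow the convention the flink-s3-fs-hadoop plugin forwards to Hadoop's fs.s3a.* settings; the values are hypothetical, and in a real deployment these keys would normally live in flink-conf.yaml rather than be set programmatically.]

```java
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class S3GlobalOptions {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // Forwarded by the S3 filesystem plugin to fs.s3a.endpoint /
        // fs.s3a.path.style.access; values are hypothetical.
        conf.setString("s3.endpoint", "http://localhost:9000");
        conf.setString("s3.path.style.access", "true");
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment(conf);
        // ... define the job; per-connector s3a options are what FLINK-19589 tracks.
    }
}
```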

Re: Issue with Flink jobs after upgrading to Flink 1.13.1/Ververica 2.5 - java.lang.NoClassDefFoundError: org/apache/hadoop/conf/Configuration

2021-12-08 Thread Natu Lauchande
…Something you could try is removing the packaged parquet format and defining a custom format[1]. For this custom format you can then fix the dependencies by packaging all of the following into the format: * flink-sql-parquet * flink-shaded-hadoop-2-uber * hadoop-aws *

Re: Issue with Flink jobs after upgrading to Flink 1.13.1/Ververica 2.5 - java.lang.NoClassDefFoundError: org/apache/hadoop/conf/Configuration

2021-12-08 Thread Ingo Bürk
Hi Natu, Something you could try is removing the packaged parquet format and defining a custom format[1]. For this custom format you can then fix the dependencies by packaging all of the following into the format: * flink-sql-parquet * flink-shaded-hadoop-2-uber * hadoop-aws * aws-java-sdk

Re: Issue with Flink jobs after upgrading to Flink 1.13.1/Ververica 2.5 - java.lang.NoClassDefFoundError: org/apache/hadoop/conf/Configuration

2021-12-06 Thread Natu Lauchande
…building the image with hadoop-client libraries): java.lang.NoClassDefFoundError: org/apache/hadoop/conf/Configuration at java.lang.Class.getDeclaredConstructors0(Native Method) at java.lang.Class.privateGetDeclaredConstructors(Class.java:2671) at java.lang.Class.getDeclaredConstructors

Re: Replacing S3 Client in Hadoop plugin

2021-11-23 Thread Martijn Visser
Hi Tamir, Thanks for providing the information. I don't know of a solution right now; perhaps some other user has an idea, but I do find your input valuable for future improvements with regard to the S3 Client in Hadoop. Best regards, Martijn On Fri, 19 Nov 2021 at 09:21, Tamir Sagi

flink 1.12: how to configure multiple Hadoop configurations; S3 usage issue

2021-11-22 Thread RS
hi, Environment: 1. flink-1.12 (the version can be upgraded) 2. flink-conf sets env.hadoop.conf.dir, and that path contains the HDFS cluster's core-site.xml and hdfs-site.xml; state.backend is stored on that HDFS 3. Flink is deployed as K8s + session. Requirement: read files from an S3-protocol distributed file system, process them, and write the results to MySQL. Problem: the S3 settings use Hadoop's configuration style, saved as a new core-site.xml file, following https://hadoop.apache.org/docs/stable/hadoop

Replacing S3 Client in Hadoop plugin

2021-11-19 Thread Tamir Sagi
Hey Martijn, sorry for the late response. We wanted to replace the default client with our custom S3 client and not use the AmazonS3Client provided by the plugin. We used Flink-s3-fs-hadoop v1.12.2, and for our needs we had to upgrade to v1.14.0 [1]. The AmazonS3 client factory is initialized[2

Re: Replacing S3 Client in Hadoop plugin

2021-10-13 Thread Martijn Visser
Hi, Could you elaborate on why you would like to replace the S3 client? Best regards, Martijn On Wed, 13 Oct 2021 at 17:18, Tamir Sagi wrote: > I found the dependency org.apache.hadoop:hadoop-aws:3.3.1; apparently its

Re: Replacing S3 Client in Hadoop plugin

2021-10-13 Thread Tamir Sagi
I found the dependency org.apache.hadoop:hadoop-aws:3.3.1. Apparently it's possible; there is a method setAmazonS3Client. I think I found the solution. Thanks. Tamir. From: Tamir Sagi Sent: Wednesday, October 13, 2021 5:44 PM To: user

Replacing S3 Client in Hadoop plugin

2021-10-13 Thread Tamir Sagi
Hey community. I would like to know if there is any way to replace the S3 client in the Hadoop plugin[1] with a custom client (AmazonS3). I did notice that the Hadoop plugin supports replacing the implementation of S3AFileSystem using "fs.s3a.impl" (in flink-conf.yaml it will b
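[Editor's note: for context, a minimal sketch of the "fs.s3a.impl" mechanism mentioned here, using plain Hadoop APIs; com.example.CustomS3AFileSystem is a hypothetical subclass of S3AFileSystem, not anything from this thread.]

```java
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class CustomS3AImplDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // "fs.s3a.impl" maps the s3a:// scheme to a FileSystem implementation
        // class; Hadoop instantiates it for every s3a:// URI.
        conf.set("fs.s3a.impl", "com.example.CustomS3AFileSystem"); // hypothetical class
        FileSystem fs = FileSystem.get(URI.create("s3a://my-bucket/"), conf);
        System.out.println("Resolved implementation: " + fs.getClass().getName());
    }
}
```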

Re: Can anyone help me? How to connect an offline hadoop cluster and a realtime hadoop cluster via different hive catalogs?

2021-08-27 Thread Caizhi Weng
Hi! It seems that your Flink cluster cannot connect to realtime-cluster-master001/xx.xx.xx.xx:8050. Please check your network and port status. Jim Chen wrote on Fri, Aug 27, 2021 at 2:20 PM: > Hi, All My flink version is 1.13.1 and my company has two hadoop clusters, an offline hadoop cluster and

Can anyone help me? How to connect an offline hadoop cluster and a realtime hadoop cluster via different hive catalogs?

2021-08-27 Thread Jim Chen
Hi, All My flink version is 1.13.1 and my company has two hadoop clusters, an offline hadoop cluster and a realtime hadoop cluster. Now, on the realtime hadoop cluster, we want to submit a flink job that connects to the offline hadoop cluster via a different hive catalog. I use different hive configuration directories
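[Editor's note: a minimal sketch of such a two-catalog setup, assuming the flink-connector-hive dependency; the catalog names and conf directories are hypothetical, each directory holding the hive-site.xml of one cluster. Whether queries succeed still depends on network reachability, as the reply above points out.]

```java
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;
import org.apache.flink.table.catalog.hive.HiveCatalog;

public class TwoHiveCatalogs {
    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(
                EnvironmentSettings.newInstance().inBatchMode().build());
        // One catalog per cluster, each pointing at its own hive conf directory.
        tEnv.registerCatalog("offline_hive",
                new HiveCatalog("offline_hive", "default", "/etc/hive-conf-offline"));
        tEnv.registerCatalog("realtime_hive",
                new HiveCatalog("realtime_hive", "default", "/etc/hive-conf-realtime"));
        tEnv.useCatalog("offline_hive");
        // Tables in the other catalog stay reachable fully qualified, e.g.
        // realtime_hive.default.some_table.
    }
}
```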

Re: Issue with Flink jobs after upgrading to Flink 1.13.1/Ververica 2.5 - java.lang.NoClassDefFoundError: org/apache/hadoop/conf/Configuration

2021-07-23 Thread Flavio Pompermaier
…going on. Until then, a workaround could be to add Hadoop manually and set the HADOOP_CLASSPATH environment variable. The root cause seems to be that Hadoop cannot be found. Alternatively, you could also build a custom image and include Hadoop in the lib folder of Flink: https:/

Re: Issue with Flink jobs after upgrading to Flink 1.13.1/Ververica 2.5 - java.lang.NoClassDefFoundError: org/apache/hadoop/conf/Configuration

2021-07-22 Thread Timo Walther
Thanks, this should definitely work with the pre-packaged connectors of Ververica platform. I guess we have to investigate what is going on. Until then, a workaround could be to add Hadoop manually and set the HADOOP_CLASSPATH environment variable. The root cause seems to be that Hadoop cannot

Re: Issue with Flink jobs after upgrading to Flink 1.13.1/Ververica 2.5 - java.lang.NoClassDefFoundError: org/apache/hadoop/conf/Configuration

2021-07-22 Thread Natu Lauchande
No custom code; all through Flink SQL on the UI, no jars. Thanks, Natu On Thu, Jul 22, 2021 at 2:08 PM Timo Walther wrote: > Hi Natu, Ververica Platform 2.5 has updated

Re: Issue with Flink jobs after upgrading to Flink 1.13.1/Ververica 2.5 - java.lang.NoClassDefFoundError: org/apache/hadoop/conf/Configuration

2021-07-22 Thread Timo Walther
…1.13.1 as the supported flink version. No custom code; all through Flink SQL on the UI, no jars. Thanks, Natu On Thu, Jul 22, 2021 at 2:08 PM Timo Walther wrote: Hi Natu, Ververica Platform 2.5 has updated the bundled Hadoop version but this should n

Re: Issue with Flink jobs after upgrading to Flink 1.13.1/Ververica 2.5 - java.lang.NoClassDefFoundError: org/apache/hadoop/conf/Configuration

2021-07-22 Thread Natu Lauchande
…Walther wrote: > Hi Natu, Ververica Platform 2.5 has updated the bundled Hadoop version but this should not result in a NoClassDefFoundError exception. How are you submitting your SQL jobs? You don't use Ververica's SQL service but have built a regular JAR file, right? I

Re: Issue with Flink jobs after upgrading to Flink 1.13.1/Ververica 2.5 - java.lang.NoClassDefFoundError: org/apache/hadoop/conf/Configuration

2021-07-22 Thread Timo Walther
Hi Natu, Ververica Platform 2.5 has updated the bundled Hadoop version but this should not result in a NoClassDefFoundError exception. How are you submitting your SQL jobs? You don't use Ververica's SQL service but have built a regular JAR file, right? If this is the case, can you share your

Issue with Flink jobs after upgrading to Flink 1.13.1/Ververica 2.5 - java.lang.NoClassDefFoundError: org/apache/hadoop/conf/Configuration

2021-07-22 Thread Natu Lauchande
to FAILED on 10.243.3.0:42337-2a3224 @ 10-243-3-0.flink-metrics.vvp-jobs.svc.cluster.local (dataPort=39309). java.lang.NoClassDefFoundError: org/apache/hadoop/conf/Configuration at java.lang.Class.getDeclaredConstructors0(Native Method) ~[?:1.8.0_292

Re: Failure running Flink locally with flink-s3-fs-hadoop + AWS SDK v2 as a dependency

2021-07-20 Thread Yaroslav Tkachenko
…Could this be a LocalStreamEnvironment limitation? Is there any way to enable plugin loading locally? Thanks! On 2021/06/21 11:13:29, Yuval Itzchakov wrote: > Currently I have the s3-hadoop dependency in my build.sbt. I guess I need to move it to the PLUGIN directory locall

Re: Re: Re: Re: Flink 1.11.1 on k8s: how to configure Hadoop

2021-05-23 Thread mts_geek
Hello, I ran into this problem too. How exactly did you build your image? In my Dockerfile I add COPY --chown=flink:flink jars/flink-shaded-hadoop-2-2.8.3-10.0.jar $FLINK_HOME/lib/ but when running a flink 1.11 session cluster on k8s, the jobserver starts, yet submitting a job fails with an error saying HadoopUtils cannot be initialized, even though flink-shaded-hadoop-2-2.8.3-10.0.jar definitely contains the HadoopUtils class. Caused

Flink Hive connector: hive-conf-dir supports hdfs URI, while hadoop-conf-dir supports local path only?

2021-04-26 Thread Yik San Chan
Hi community, This question is cross-posted on Stack Overflow https://stackoverflow.com/questions/67264156/flink-hive-connector-hive-conf-dir-supports-hdfs-uri-while-hadoop-conf-dir-sup In my current setup, local dev env can access testing env. I would like to run Flink job on local dev env

Re: Flink Hadoop config on docker-compose

2021-04-22 Thread Matthias Pohl
…options. It is only used to construct the classpath for the JM/TM process. However, in "HadoopUtils"[2] we do not support getting the hadoop configuration from the classpath. [1]. https://github.com/apache/flink/blob/release-1.11/flink-dist/sr

Re: Flink Hadoop config on docker-compose

2021-04-22 Thread Flavio Pompermaier
…at 4:52 AM Yang Wang wrote: > It seems that we do not export HADOOP_CONF_DIR as an environment variable in the current implementation, even though we have set the env.xxx flink config options. It is only used to construct the classpath for the JM/TM process.

Re: Flink Hadoop config on docker-compose

2021-04-15 Thread Yang Wang
It seems that we do not export HADOOP_CONF_DIR as an environment variable in the current implementation, even though we have set the env.xxx flink config options. It is only used to construct the classpath for the JM/TM process. However, in "HadoopUtils"[2] we do not support getting

Re: Flink Hadoop config on docker-compose

2021-04-15 Thread Flavio Pompermaier
Hi Robert, indeed my docker-compose only works if I also add the Hadoop and YARN home variables, while I was expecting those two variables to be generated automatically just by setting env.xxx variables in the FLINK_PROPERTIES variable. I just want to understand what to expect, and whether I really need to specify

Re: Flink Hadoop config on docker-compose

2021-04-15 Thread Robert Metzger
Hi, I'm not aware of any known issues with Hadoop and Flink on Docker. I also tried what you are doing locally, and it seems to work: flink-jobmanager| 2021-04-15 18:37:48,300 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint[] - Starting StandaloneSessionClusterEntrypoint

Flink Hadoop config on docker-compose

2021-04-14 Thread Flavio Pompermaier
Hi everybody, I'm trying to set up reading from HDFS using docker-compose and Flink 1.11.3. If I pass 'env.hadoop.conf.dir' and 'env.yarn.conf.dir' using FLINK_PROPERTIES (under the environment section of the docker-compose service) I see in the logs the following line: "Could not find H

Re: Hadoop is not in the classpath/dependencies

2021-03-30 Thread Chesnay Schepler
This looks related to HDFS-12920, where Hadoop 2.x tries to read a duration from hdfs-default.xml expecting plain numbers, but in 3.x the values also contain time units. On 3/30/2021 9:37 AM, Matthias Seiler wrote: Thank you all for the replies! I did as @Maminspapin suggested and indeed

Re: Hadoop is not in the classpath/dependencies

2021-03-30 Thread Matthias Seiler
t;30s" // this is thrown by the flink-shaded-hadoop library ``` I thought that it relates to the windowing I do, which has a slide interval of 30 seconds, but removing it displays the same error. I also added the dependency to the maven pom, but without effect. Since I use Hadoop 3.2.1, I also t

Re: Hadoop is not in the classpath/dependencies

2021-03-26 Thread Robert Metzger
Hey Matthias, Maybe the classpath contains hadoop libraries, but not the HDFS libraries? The "DistributedFileSystem" class needs to be accessible to the classloader. Can you check if that class is available? Best, Robert On Thu, Mar 25, 2021 at 11:10 AM Matthias Seiler <
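[Editor's note: a minimal sketch of the check Robert suggests; org.apache.hadoop.hdfs.DistributedFileSystem ships in the hadoop-hdfs/hadoop-hdfs-client jars, not in hadoop-common.]

```java
public class HdfsClassCheck {
    public static void main(String[] args) {
        try {
            // Resolvable only if the HDFS client jars are on the classpath.
            Class.forName("org.apache.hadoop.hdfs.DistributedFileSystem");
            System.out.println("HDFS client classes are on the classpath");
        } catch (ClassNotFoundException e) {
            System.out.println("Missing HDFS libraries: " + e.getMessage());
        }
    }
}
```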

hadoop#configuration (subject garbled by a character-encoding failure)

2021-03-25 Thread (author name garbled)
[Message garbled by a character-encoding failure; it asks how Flink on YARN obtains the Hadoop/YARN configuration via flink-conf and hadoop#Configuration. Reference: https://issues.apache.org/jira/browse/FLINK-21981]

Re: Hadoop is not in the classpath/dependencies

2021-03-25 Thread Maminspapin
I downloaded the lib (latest version) from here: https://repo.maven.apache.org/maven2/org/apache/flink/flink-shaded-hadoop-2-uber/2.8.3-7.0/ and put it in the flink_home/lib directory. It helped.

Re: Hadoop is not in the classpath/dependencies

2021-03-25 Thread Maminspapin
I have the same problem ...

Hadoop is not in the classpath/dependencies

2021-03-25 Thread Matthias Seiler
Hello everybody, I set up a Flink (1.12.1) and Hadoop (3.2.1) cluster on two machines. The job should store the checkpoints on HDFS like so: ```java StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment(); env.enableCheckpointing(15000
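[Editor's note: a minimal sketch continuing the truncated snippet above, assuming Flink 1.12; the namenode host/port and path are hypothetical. It needs the Hadoop/HDFS classes on the classpath, which is exactly what this thread's error complains about.]

```java
import org.apache.flink.runtime.state.filesystem.FsStateBackend;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class HdfsCheckpointJob {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.enableCheckpointing(15000);
        // Checkpoints go to HDFS; fails with "Hadoop is not in the
        // classpath/dependencies" if the HDFS client jars are missing.
        env.setStateBackend(new FsStateBackend("hdfs://namenode:9000/flink/checkpoints"));
        env.fromElements("a", "b", "c").print();
        env.execute("checkpoint-to-hdfs");
    }
}
```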

Can the Flink job manager in HA mode be restarted manually, like the Hadoop Name Node?

2021-03-21 Thread macdoor
Can the Flink job manager in HA mode be restarted manually, like the Hadoop Name Node, while the cluster keeps running? I've noticed that the job manager's memory usage seems to grow slowly but continuously. The Hadoop Name Node has the same problem, which I work around by rolling-restarting the Name Node at intervals. In HA mode, can the Flink job manager be rolling-restarted the same way?

Re: Hadoop Integration Link broken in downloads page

2021-03-10 Thread Till Rohrmann
Thanks a lot for reporting this problem Debraj. I've created a JIRA issue for it [1]. [1] https://issues.apache.org/jira/browse/FLINK-21723 Cheers, Till On Tue, Mar 9, 2021 at 5:28 AM Debraj Manna wrote: > Hi, It appears the Hadoop Integration <https://ci.apache.org/proje

Hadoop Integration Link broken in downloads page

2021-03-08 Thread Debraj Manna
Hi It appears the Hadoop Integration <https://ci.apache.org/projects/flink/flink-docs-release-1.12/ops/deployment/hadoop.html> link is broken on the downloads <https://flink.apache.org/downloads.html> page. Apache Flink® 1.12.2 is our latest stable release. > If you plan to use Apache

[DISCUSS] Removal of flink-swift-fs-hadoop module

2021-01-26 Thread Robert Metzger
Hi all, during a security maintenance PR [1], Chesnay noticed that the flink-swift-fs-hadoop module is lacking test coverage [2]. Also, there hasn't been any substantial change since 2018, when it was introduced. On the user@ ML, I could not find any proof of significant use of the module (no one

Re: Re: Re: Flink job fails on submission when writing to HDFS: Recoverable writers on Hadoop are only supported for HDFS

2021-01-22 Thread 赵一旦
…@Michael Ran: OK, no problem. @张锴: which Flink version's connector do you mean, stream or sql? I searched and don't have it; I'm on 1.12, stream. The docs currently show both streamFileSink and FileSink, and from the docs their usage looks similar. I plan to try FileSink, but I'm not clear on the difference between FileSink and StreamFileSink, and whether both can write to hadoop-style file systems, since atomic writes are involved

Re: Re: Re: Flink job fails on submission when writing to HDFS: Recoverable writers on Hadoop are only supported for HDFS

2021-01-21 Thread 张锴
…> The docs currently show both streamFileSink and FileSink, and from the docs their usage looks similar. I plan to try FileSink, but I'm not clear on the difference between FileSink and StreamFileSink, and whether both can write to hadoop-style file systems, since atomic writes are involved and distributed file systems may not support append and edit. Michael Ran wrote on Thu, Jan 21, 2021 at 7:01 PM: > Sorry, I haven't used this in a long time. But you can analyze the exception message together with the API source code to determine

Re: Re: Re: Flink job fails on submission when writing to HDFS: Recoverable writers on Hadoop are only supported for HDFS

2021-01-21 Thread 张锴
I'm on Flink 1.10; FileSink is BucketingSink, and that's what I use to write to HDFS. 赵一旦 wrote on Thu, Jan 21, 2021 at 7:05 PM: > @Michael Ran: OK, no problem. @张锴: which Flink version's connector do you mean, stream or sql? I searched and don't have it; I'm on 1.12, stream. The docs currently show both streamFileSink and FileSink, and from the docs their usage looks similar. I plan to try FileSink, but I'm not clear on the difference between FileSink and StreamFileSink, and whether both can write to hadoo

Re: Re: Re: Flink job fails on submission when writing to HDFS: Recoverable writers on Hadoop are only supported for HDFS

2021-01-21 Thread 赵一旦
@Michael Ran: OK, no problem. @张锴: which Flink version's connector do you mean, stream or sql? I searched and don't have it; I'm on 1.12, stream. The docs currently show both streamFileSink and FileSink, and from the docs their usage looks similar. I plan to try FileSink, but I'm not clear on the difference between FileSink and StreamFileSink, and whether both can write to hadoop-style file systems, since atomic writes are involved and distributed file systems may not support append and edit. Michael Ran wrote on Thu, Jan 21, 2021 at 7:01 PM: > Sorry, I haven't used this in a long time. But you can
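[Editor's note on the FileSink/StreamFileSink question above: as of Flink 1.12, FileSink (flink-connector-files) is the unified successor of StreamingFileSink and works in both batch and streaming execution. A minimal row-format sketch follows, with a hypothetical path; note that on Hadoop-backed file systems FileSink goes through the same RecoverableWriter machinery, so it does not by itself avoid the HDFS-only limitation discussed in this thread.]

```java
import org.apache.flink.api.common.serialization.SimpleStringEncoder;
import org.apache.flink.connector.file.sink.FileSink;
import org.apache.flink.core.fs.Path;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class FileSinkExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // Row-format sink: one UTF-8 line per record.
        FileSink<String> sink = FileSink
                .forRowFormat(new Path("hdfs:///tmp/out"), new SimpleStringEncoder<String>("UTF-8"))
                .build();
        env.fromElements("a", "b", "c").sinkTo(sink);
        env.execute("file-sink-demo");
    }
}
```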

Re: Re: Re: Flink job fails on submission when writing to HDFS: Recoverable writers on Hadoop are only supported for HDFS

2021-01-21 Thread Michael Ran
…Writer(org.apache.hadoop.fs.FileSystem fs) {...} On 2021-01-21 17:18:23, 赵一旦 wrote: > The exact error is: > java.lang.UnsupportedOperationException: Recoverable writers on Hadoop

Re: Re: Flink job fails on submission when writing to HDFS: Recoverable writers on Hadoop are only supported for HDFS

2021-01-21 Thread 张锴
> This probably uses an HDFS-specific API; the file system isn't compatible with public HadoopRecoverableWriter(org.apache.hadoop.fs.FileSystem fs) {...} On 2021-01-21 17:18:23, 赵一旦 wrote: > The exact error is: > java.lang.UnsupportedOperationExcep

Re: Re: Flink job fails on submission when writing to HDFS: Recoverable writers on Hadoop are only supported for HDFS

2021-01-21 Thread 赵一旦
…wrote: > The exact error is: > java.lang.UnsupportedOperationException: Recoverable writers on Hadoop are only supported for HDFS at org.apache.flink.runtime.fs.hdfs.HadoopRecoverableWriter.<init>(HadoopRecoverableWriter.java:61) at org.apache.flink.runtime.fs.hdfs.HadoopFil

Re: Re: Flink job fails on submission when writing to HDFS: Recoverable writers on Hadoop are only supported for HDFS

2021-01-21 Thread Michael Ran
This probably uses an HDFS-specific API; the file system isn't compatible with public HadoopRecoverableWriter(org.apache.hadoop.fs.FileSystem fs) {...} On 2021-01-21 17:18:23, 赵一旦 wrote: > The exact error is: > java.lang.UnsupportedOperationException: Recoverable writers on Hadoop are o

Re: Flink job fails on submission when writing to HDFS: Recoverable writers on Hadoop are only supported for HDFS

2021-01-21 Thread 赵一旦
Besides this, Flink SQL also fails to read our existing hive data warehouse. With the hive catalog configured OK, all the table metadata shows up, but SELECT operations just fail. 赵一旦 wrote on Thu, Jan 21, 2021 at 5:18 PM: > The exact error is: > java.lang.UnsupportedOperationException: Recoverable writers on Hadoop are only supported for HDFS at org.apache.flink.runtime.fs.hdfs.HadoopRe

Re: Flink job fails on submission when writing to HDFS: Recoverable writers on Hadoop are only supported for HDFS

2021-01-21 Thread 赵一旦
The exact error is: java.lang.UnsupportedOperationException: Recoverable writers on Hadoop are only supported for HDFS at org.apache.flink.runtime.fs.hdfs.HadoopRecoverableWriter.<init>(HadoopRecoverableWriter.java:61) at org.apache.flink.runtime.fs.hdfs.HadoopFileSystem.createRecoverableWriter

Flink job fails on submission when writing to HDFS: Recoverable writers on Hadoop are only supported for HDFS

2021-01-21 Thread 赵一旦
Recoverable writers on Hadoop are only supported for HDFS. As above: we use the hadoop protocol, but the underlying storage is not HDFS; it is our company's in-house distributed file system. Writing with Spark and reading with Spark SQL etc. all work fine, but so far neither writing nor reading with Flink has succeeded.
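[Editor's note: for context, a minimal sketch of the write path that raises this error. A StreamingFileSink on a hadoop-scheme URI asks the file system for a recoverable writer, and HadoopRecoverableWriter's constructor rejects anything that is not real HDFS, producing the stack trace quoted in this thread; the path below is hypothetical.]

```java
import org.apache.flink.api.common.serialization.SimpleStringEncoder;
import org.apache.flink.core.fs.Path;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink;

public class RecoverableWriterDemo {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        StreamingFileSink<String> sink = StreamingFileSink
                // With a non-HDFS Hadoop file system behind this URI, opening the
                // sink fails in HadoopRecoverableWriter's constructor.
                .forRowFormat(new Path("hdfs:///data/out"), new SimpleStringEncoder<String>("UTF-8"))
                .build();
        env.fromElements("a", "b", "c").addSink(sink);
        env.execute("streaming-file-sink-demo");
    }
}
```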

Re: flink-shaded-hadoop-2-uber: how to choose the version

2020-12-22 Thread superainbower
[Body garbled by a character-encoding failure; it mentions K8s HA and HDFS, quoting liujian's 2020-12-22 13:43 message about flink-conf, the history server, HDFS, and the web UI.]

Re: flink-shaded-hadoop-2-uber: how to choose the version

2020-12-21 Thread liujian
Thanks. [Body garbled by a character-encoding failure; it mentions flink-conf, the history server, HDFS, and the web UI.]

Re: flink-shaded-hadoop-2-uber: how to choose the version

2020-12-21 Thread Yang Wang
…> COPY ./jar/flink-shaded-hadoop-2-uber-2.8.3-10.0.jar /opt/flink/lib/flink-shaded-hadoop-2-uber-2.8.3-10.0.jar > ENTRYPOINT ["/docker-entrypoint.sh"] > EXPOSE 6123 8081 8082 > CMD ["help","history-server"]

Re: flink-shaded-hadoop-2-uber: how to choose the version

2020-12-21 Thread liujian
Thanks. [Partially garbled; asks about docker and the Native K8s mode.] The Dockerfile: COPY ./jar/flink-shaded-hadoop-2-uber-2.8.3-10.0.jar /opt/flink/lib/flink-shaded-hadoop-2-uber-2.8.3-10.0.jar ENTRYPOINT ["/docker-entrypoint.sh"] EXPOSE

Re: flink-shaded-hadoop-2-uber: how to choose the version

2020-12-20 Thread Yang Wang
"user-zh" > < > danrtsey...@gmail.com; > 发送时间:2020年12月21日(星期一) 上午10:15 > 收件人:"user-zh" > 主题:Re: flink-shaded-hadoop-2-uber版本如何选择 > > > > > 你不需要修改CMD,entrypoint默认是docker-entrypoint.sh[1],是支

Re: flink-shaded-hadoop-2-uber: how to choose the version

2020-12-20 Thread liujian
[Body garbled by a character-encoding failure; it concerns the history-server.]

Re: flink-shaded-hadoop-2-uber: how to choose the version

2020-12-20 Thread Yang Wang
…<danrtsey...@gmail.com> Sent: Saturday, Dec 19, 2020, 9:35 PM To: "user-zh" Subject: Re: flink-shaded-hadoop-2-uber: how to choose the version > You only need to set the HADOOP_CONF_DIR environment on the Flink Client side. The Flink Client will automatically ship the hdfs-site.xml and core-site.xml files by creating a dedicated ConfigMap, then

Re: flink-shaded-hadoop-2-uber: how to choose the version

2020-12-19 Thread liujian
Thanks. [Partially garbled; asks about the historyServer with flink 1.12.0, a Dockerfile whose CMD is ["history-server"], and port 8082.]

Re: flink-shaded-hadoop-2-uber: how to choose the version

2020-12-19 Thread Yang Wang
You only need to set the HADOOP_CONF_DIR environment on the Flink Client side. The Flink Client automatically ships the hdfs-site.xml and core-site.xml files by creating a dedicated ConfigMap and mounting it into the JobManager and TaskManager. Both configs are also loaded onto the classpath automatically, so as long as flink-shaded-hadoop is in lib, nothing else is needed and HDFS can be accessed directly. Best, Yang liujian <13597820...@qq.com> wrote on Sat, Dec 19, 2020: > HD

Re: flink-shaded-hadoop-2-uber: how to choose the version

2020-12-19 Thread liujian
[Body garbled by a character-encoding failure; asks about HDFS HA and hdfs-site.xml, and whether the ConfigMap's hdfs-site.xml ends up under $FLINK_HOME/conf.]

Re: flink-shaded-hadoop-2-uber: how to choose the version

2020-12-16 Thread Yang Wang
If you access HDFS on K8s, you still need to put flink-shaded-hadoop into the lib directory, because Hadoop's FileSystem currently does not support plugin loading. Best, Yang superainbower wrote on Wed, Dec 16, 2020: > Piggybacking on this thread: how do you access HDFS when deployed on K8s? For now I still bake the shaded jar into the image. On Dec 16, 2020, 10:53, Yang Wang wrote: > Taking flink-shaded-hadoop-2-uber 2.8.3-10.0 as an example, 2.8.3 refers

Re: flink-shaded-hadoop-2-uber: how to choose the version

2020-12-16 Thread superainbower
Piggybacking on this thread: how do you access HDFS when deployed on K8s? For now I still bake the shaded jar into the image. On Dec 16, 2020, 10:53, Yang Wang wrote: Taking flink-shaded-hadoop-2-uber 2.8.3-10.0 as an example, 2.8.3 is the Hadoop version and 10.0 is the flink-shaded[1] version. Since 1.10 the community no longer recommends the flink-shaded-hadoop approach and instead submits with the HADOOP_CLASSPATH environment variable set[2]; this makes Flink hadoop-free, supporting both hadoop2 and hadoop3. If you still insist on using

Re: flink-shaded-hadoop-2-uber: how to choose the version

2020-12-15 Thread Yang Wang
Taking flink-shaded-hadoop-2-uber 2.8.3-10.0 as an example, 2.8.3 is the Hadoop version and 10.0 is the flink-shaded[1] version. Since 1.10 the community no longer recommends the flink-shaded-hadoop approach and instead submits with the HADOOP_CLASSPATH environment variable set[2]; this makes Flink hadoop-free, supporting both hadoop2 and hadoop3. If you still insist on using flink-shaded-hadoop, then the latest version, 2.8.3-10.0, is recommended. [1]. https://github.com/apache/flink

Re: flink-shaded-hadoop-2-uber*-*: how to determine the version

2020-12-15 Thread Yang Wang
You need to confirm that hadoop classpath returns the complete classpath; normally the hadoop classpath command includes all the hadoop jars. If a missing class or method is reported, check whether the corresponding jar exists and is included. The community recommends the hadoop classpath approach mainly to make Flink hadoop-free, so it runs normally on both hadoop2 and hadoop3. Best, Yang Jacob <17691150...@163.com> wrote on Tue, Dec 15, 2020 at 9:25 AM: > Thanks for the reply! > I have looked at that doc too

Re: flink-shaded-hadoop-2-uber*-*: how to determine the version

2020-12-14 Thread Jacob
Thanks for the reply! I have looked at that doc too. A few days ago, while testing job submission with the flink 1.9–1.12 clients, I found that for the 1.10+ versions manually running export HADOOP_CLASSPATH=`hadoop classpath` had no effect: all kinds of errors, mostly Hadoop classes or methods not found (NoSuchMethod-style errors). Changing the pom file back and forth was useless too; only after adding the flink-shaded-hadoop-2-uber*-* dependency to the pom could the job be submitted and run normally.

Re: flink-shaded-hadoop-2-uber*-*: how to determine the version

2020-12-13 Thread silence
flink no longer recommends putting the hadoop jars into lib. You can load the hadoop dependencies via export HADOOP_CLASSPATH=`hadoop classpath`. Reference: https://ci.apache.org/projects/flink/flink-docs-release-1.11/ops/deployment/hadoop.html#providing-hadoop-classes

flink-shaded-hadoop-2-uber*-*: how to determine the version

2020-12-12 Thread Jacob
When upgrading the flink version, this package needs to be put into flink/lib, but how is the package's version number determined? flink-shaded-hadoop-2-uber*-*

flink-shaded-hadoop-2-uber: how to choose the version

2020-12-10 Thread 赢峰
How should the flink-shaded-hadoop-2-uber version be chosen? What do the two parts of xxx-xxx each mean?

Re: Re: Re: Re: Flink 1.11.1 on k8s: how to configure Hadoop

2020-10-11 Thread Dream-底限
hi, you can build the image directly on one of the hadoop nodes: when building, package the needed hadoop dependency jars and flink together into the docker image, then configure a few environment variables and it's ready to use. If the node where your docker is deployed has hadoop or flink, you can also mount them externally. We currently use the first approach. Yang Wang wrote on Mon, Oct 12, 2020 at 10:23 AM: > Just base on the community image, add one more layer (copying flink-shaded-hadoop), commit it to a docker image, and push it to the docker registry. > For example, the Dockerfile can be as follows

Re: Re: Re: Re: Flink 1.11.1 on k8s: how to configure Hadoop

2020-10-11 Thread Yang Wang
Just base on the community image, add one more layer (copying flink-shaded-hadoop), commit it to a docker image, and push it to the docker registry. For example, the Dockerfile can be as follows: FROM flink:1.11.1-scala_2.11 COPY flink-shaded-hadoop-2*.jar /opt/flink/lib/ Also, flink-shaded-hadoop can be downloaded here[1] [1]. https://mvnrepository.com/artifact/org.apache.flink/flink-shaded-hadoop

Re: Re: Re: Re: Flink 1.11.1 on k8s: how to configure Hadoop

2020-10-10 Thread yang
May I ask how you rebuilt the image? Did you unpack the original jar and then repackage it?

Re: Re: Re: Re: Flink 1.11.1 on k8s: how to configure Hadoop

2020-10-10 Thread yang
May I ask: to rebuild the image, do you unpack the original package and then repackage it?

Re: Issues with Flink Batch and Hadoop dependency

2020-08-31 Thread Arvid Heise
Hi Dan, Your approach in general is good. You might want to use the bundled hadoop uber jar [1] to save some time if you find the appropriate version. You can also build your own version and then include it in lib/. In general, I'd recommend moving away from sequence files. As soon as you change

Re: Issues with Flink Batch and Hadoop dependency

2020-08-29 Thread Dan Hill
I was able to get a basic version to work by including a bunch of hadoop and s3 dependencies in the job jar and hacking in some hadoop config values. It's probably not optimal but it looks like I'm unblocked. On Fri, Aug 28, 2020 at 12:11 PM Dan Hill wrote: > I'm assuming I have a sim
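[Editor's note: a hedged sketch of that direct-dependency route, assuming the Hadoop jars are in the job jar as described above; the config key, endpoint, and paths are hypothetical. A JobConf carries the "hacked in" Hadoop config values alongside the input path.]

```java
import org.apache.flink.hadoopcompatibility.HadoopInputs;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.SequenceFileInputFormat;

public class SequenceFileWithJobConf {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // Hadoop config values travel with the input format via the JobConf.
        JobConf jobConf = new JobConf();
        jobConf.set("fs.s3a.endpoint", "http://localhost:9000"); // hypothetical value
        FileInputFormat.addInputPath(jobConf, new Path("s3a://bucket/input")); // hypothetical path
        env.createInput(HadoopInputs.createHadoopInput(
                        new SequenceFileInputFormat<Text, BytesWritable>(),
                        Text.class, BytesWritable.class, jobConf))
           .print();
        env.execute("sequence-file-with-jobconf");
    }
}
```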

Issues with Flink Batch and Hadoop dependency

2020-08-28 Thread Dan Hill
these Sequence files, I get the following error: NoClassDefFoundError: org/apache/hadoop/mapred/FileInputFormat It fails on this readSequenceFile. env.createInput(HadoopInputs.readSequenceFile(Text.class, ByteWritable.class, INPUT_FILE)) If I directly depend on org-apache-hadoop/hadoop-mapred when

Re: Re: hive-exec dependency causes a hadoop conflict

2020-08-24 Thread amen...@163.com
OK, thanks for the reply. With the hive version pinned to 2.1.1, I chose to import the hive-exec-2.1.1 and flink-connector-hive_2.11-1.11.1 dependencies in the program, and hive tables can now be operated on normally. best, amenhub From: Rui Li Sent: 2020-08-24 21:33 To: user-zh Subject: Re: hive-exec dependency causes a hadoop conflict Hi, hive-exec itself does not contain Hadoop; if it comes in through a Maven transitive dependency, it can be excluded when packaging. At runtime you can use your cluster's Hadoop version rather than the Hadoop version hive itself dep

Re: hive-exec dependency causes a hadoop conflict

2020-08-24 Thread Rui Li
Hi, hive-exec itself does not contain Hadoop; if it comes in through a Maven transitive dependency, it can be excluded when packaging. At runtime you can use your cluster's Hadoop version rather than the Hadoop version hive itself depends on. Also, for Flink 1.11 you can consider the officially provided flink-sql-connector-hive uber jar, which contains all of hive's dependencies (the Hadoop dependencies still need to be added separately). For more details see the docs [1][2]. [1] https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/table

Re: hive-exec dependency causes a hadoop conflict

2020-08-24 Thread amen...@163.com
To add: when I removed hive-exec and the program's other hadoop dependencies, the job still failed, so perhaps something on my side wasn't in place. I suspected a dependency conflict because, before testing the hive integration, I had submitted to yarn without any problem, so my troubleshooting turned to hive. It now looks like it may be caused by something else. A bit of the exception stack: Caused by: org.apache.flink.client.deployment.ClusterDeploymentException: Could not deploy Yarn job cluster
