Hi fengqi,
The "Hadoop is not in the classpath/dependencies." error means that the classes HDFS needs, such as org.apache.hadoop.conf.Configuration and org.apache.hadoop.fs.FileSystem, could not be found.
If Hadoop is available in your environment, the classpath is usually set like this:
export HADOOP_CLASSPATH=`hadoop classpath`
If you are submitting to a local standalone Flink cluster, you can check the log files Flink generates, which will print
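The classpath advice above can be sketched as a small shell snippet (a sketch, not an official script; it assumes the `hadoop` CLI may or may not be on the PATH and degrades to a message otherwise):

```shell
# Make Hadoop classes visible to the Flink client before submitting a job.
# Guarded so the snippet is a no-op on machines without a Hadoop install.
if command -v hadoop >/dev/null 2>&1; then
  export HADOOP_CLASSPATH="$(hadoop classpath)"
  echo "HADOOP_CLASSPATH is set"
else
  echo "hadoop CLI not found; HADOOP_CLASSPATH left unset"
fi
```

With the variable exported, a `flink run ...` issued from the same shell will see the Hadoop classes.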
I could share some metrics about Alibaba Cloud EMR clusters.
The ratio of Hadoop 2 to Hadoop 3 is 1:3.
Best,
Yang
On Thu, Dec 28, 2023 at 8:16 PM Martijn Visser
wrote:
> Hi all,
>
> I want to get some insights on how many users are still using Hadoop 2
> vs how many users are us
Hi all,
I want to get some insights on how many users are still using Hadoop 2
vs how many users are using Hadoop 3. Flink currently requires a
minimum version of Hadoop 2.10.2 for certain features, but also
makes extensive use of Hadoop 3 (e.g., for the file system implementations).
Hadoop 2 has
Hi Mengxi Wang,
Which Flink version are you using?
Best regards,
Martijn
On Thu, Jul 13, 2023 at 3:21 PM Wang, Mengxi X via user <
user@flink.apache.org> wrote:
> Hi community,
>
>
>
> We got this Kerberos error with Hadoop as file system on ECS Fargate
> de
Hi community,
We got this Kerberos error with Hadoop as the file system on an ECS Fargate
deployment.
Caused by: org.apache.hadoop.security.KerberosAuthException: failure to login:
javax.security.auth.login.LoginException: java.lang.NullPointerException:
invalid null input: name
Caused
Currently, exporting the env "HADOOP_CONF_DIR" only works for the native
K8s integration. The Flink client will try to create the
hadoop-config-volume automatically if the Hadoop env is found.
If you want to set the HADOOP_CONF_DIR in the docker image, please also
make sure the specified h
Hi, community:
I'm using flink-k8s-operator v1.2.0 to deploy a Flink job. The
"HADOOP_CONF_DIR" environment variable was set in the image that I built
from flink:1.15. I found the taskmanager pod was trying to mount a volume
named "hadoop-config-volume"
Hi all,
I’m converting several batch Flink workflows to streaming, with bounded sources.
Some of our sources are reading Hadoop sequence files via
StreamExecutionEnvironment.createInput(HadoopInputFormat).
The problem is that StreamGraphGenerator.existsUnboundedSource is returning
true
loaded over statefun's
>> protobuf-java 3.7.1, and NoSuchMethod exceptions occur.
>> >
>> > We hacked together a version of statefun that doesn't perform the check
>> whether the classloader settings contain the three patterns from above, and
>> as long as our job u
uf-java 3.7.1, and NoSuchMethod exceptions occur.
> >
> > We hacked together a version of statefun that doesn't perform the check
> whether the classloader settings contain the three patterns from above, and
> as long as our job uses protobuf-java 3.7.1 and the com.google.protob
and the com.google.protobuf
> pattern is not present in the classloader.parent-first-patterns.additional
> setting, then all is well.
>
> Aside from removing old hadoop from the classpath, which may not be possible
> given that it's a shared cluster, is there anything we can do o
patterns from above, and
as long as our job uses protobuf-java 3.7.1 and the
com.google.protobuf pattern
is not present in the classloader.parent-first-patterns.additional setting,
then all is well.
Aside from removing old hadoop from the classpath, which may not be
possible given that it's
As there were no strong objections, we'll proceed with bumping the Hadoop
version to 2.8.5 and removing the safeguards and the CI for any earlier
versions. This will effectively make Hadoop 2.8.5 the lowest supported
version in Flink 1.15.
Best,
D.
On Thu, Dec 23, 2021 at 11:03 AM Till
If there are no users strongly objecting to dropping Hadoop support for <
2.8, then I am +1 for this since otherwise we won't gain a lot as Xintong
said.
Cheers,
Till
On Wed, Dec 22, 2021 at 10:33 AM David Morávek wrote:
> Agreed, if we drop the CI for lower versions, there is ac
Agreed, if we drop the CI for lower versions, there is actually no point in
having safeguards, as we can't really test for them.
Maybe one more thought (it's more of a feeling): I feel that users running
really old Hadoop versions are usually slower to adopt (they most likely
use what the current
Sorry to join the discussion late.
+1 for dropping support for hadoop versions < 2.8 from my side.
TBH, wrapping the reflection-based logic in safeguards sounds a bit like
neither fish nor fowl to me. It weakens the major benefits that we look for
by dropping support for early versi
CC user@f.a.o
Is anyone aware of something that blocks us from doing the upgrade?
D.
On Tue, Dec 21, 2021 at 5:50 PM David Morávek
wrote:
> Hi Martijn,
>
> from personal experience, most Hadoop users are lagging behind the release
> lines by a lot, because upgrading a Ha
The effort will be
>> tracked under the following ticket:
>>
>> https://issues.apache.org/jira/browse/FLINK-19589
>>
>> I will loop-in Arvid (in CC) which might help you in contributing the
>> missing functionality.
>>
>> Regards,
>> Timo
>>
> Regards,
> Timo
>
>
> On 10.12.21 23:48, Timothy James wrote:
> > Hi,
> >
> > The Hadoop s3a library itself supports some properties we need, but the
> > "FileSystem SQL Connector" (via FileSystemTableFactory) does not pass
> > connec
23:48, Timothy James wrote:
Hi,
The Hadoop s3a library itself supports some properties we need, but the
"FileSystem SQL Connector" (via FileSystemTableFactory) does not pass
connector options for these to the "Hadoop/Presto S3 File Systems
plugins" (via S3FileSystemFacto
Hi,
The Hadoop s3a library itself supports some properties we need, but the
"FileSystem SQL Connector" (via FileSystemTableFactory) does not pass
connector options for these to the "Hadoop/Presto S3 File Systems plugins"
(via S3FileSystemFactory).
Instead, only Job-global
thing you could try is removing the packaged parquet format and
> defining a custom format[1]. For this custom format you can then fix the
> dependencies by packaging all of the following into the format:
>
> * flink-sql-parquet
> * flink-shaded-hadoop-2-uber
> * hadoop-aws
> *
Hi Natu,
Something you could try is removing the packaged parquet format and
defining a custom format[1]. For this custom format you can then fix the
dependencies by packaging all of the following into the format:
* flink-sql-parquet
* flink-shaded-hadoop-2-uber
* hadoop-aws
* aws-java-sdk
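The dependency list above would look roughly like this in a Maven POM (a sketch only; the versions are placeholders, and the `flink-sql-parquet` artifact may need a Scala-suffix variant depending on your Flink version):

```xml
<!-- Sketch: bundle these into the custom format artifact. Versions are
     placeholders and must match your Flink and Hadoop distributions. -->
<dependencies>
  <dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-sql-parquet</artifactId>
    <version>${flink.version}</version>
  </dependency>
  <dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-shaded-hadoop-2-uber</artifactId>
    <version>2.8.3-10.0</version>
  </dependency>
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-aws</artifactId>
    <version>${hadoop.version}</version>
  </dependency>
  <dependency>
    <groupId>com.amazonaws</groupId>
    <artifactId>aws-java-sdk</artifactId>
    <version>${aws.sdk.version}</version>
  </dependency>
</dependencies>
```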
building the image with hadoop-client libraries) :
java.lang.NoClassDefFoundError: org/apache/hadoop/conf/Configuration
at java.lang.Class.getDeclaredConstructors0(Native Method)
at java.lang.Class.privateGetDeclaredConstructors(Class.java:2671)
at java.lang.Class.getDeclaredConstructors
Hi Tamir,
Thanks for providing the information. I don't know of a current solution
right now, perhaps some other user has an idea, but I do find your input
valuable for future improvements with regards to the S3 Client in Hadoop.
Best regards,
Martijn
On Fri, 19 Nov 2021 at 09:21, Tamir Sagi
hi,
Environment:
1. flink-1.12; the version can be upgraded.
2. env.hadoop.conf.dir is configured in flink-conf, and the path contains the HDFS cluster's core-site.xml and hdfs-site.xml;
state.backend is stored on that HDFS.
3. Flink is deployed as K8s + session.
Requirement:
Read files from an S3-protocol distributed file system, process them, and write the results to MySQL.
Problem:
The S3 settings use the Hadoop configuration style, saved as a new core-site.xml file, following
https://hadoop.apache.org/docs/stable/hadoop
Hey Martijn,
sorry for the late response.
We wanted to replace the default client with our custom S3 client and not use
the AmazonS3Client provided by the plugin.
We used Flink-s3-fs-hadoop v1.12.2 and for our needs we had to upgrade to
v1.14.0 [1].
AmazonS3 client factory is initialized[2
Hi,
Could you elaborate on why you would like to replace the S3 client?
Best regards,
Martijn
On Wed, 13 Oct 2021 at 17:18, Tamir Sagi
wrote:
> I found the dependency
>
>
> org.apache.hadoop
> hadoop-aws
> 3.3.1
>
>
> apparently its
I found the dependency org.apache.hadoop:hadoop-aws:3.3.1.
Apparently it's possible; there is a method
setAmazonS3Client
I think I found the solution.
Thanks.
Tamir.
From: Tamir Sagi
Sent: Wednesday, October 13, 2021 5:44 PM
To: user
Hey community.
I would like to know if there is any way to replace the S3 client in the Hadoop
plugin[1] with a custom client (AmazonS3).
I did notice that the Hadoop plugin supports replacing the implementation of
S3AFileSystem using
"fs.s3a.impl" (in flink-conf.yaml it will b
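For illustration, the override mentioned there could look like this in flink-conf.yaml (a sketch; the class name is hypothetical and the implementation must be resolvable from the plugin's classpath):

```yaml
# Hypothetical example: point the s3a scheme at a custom FileSystem
# implementation. com.example.MyCustomS3AFileSystem is a placeholder name.
fs.s3a.impl: com.example.MyCustomS3AFileSystem
```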
Hi!
It seems that your Flink cluster cannot connect to
realtime-cluster-master001/xx.xx.xx.xx:8050. Please check your network and
port status.
Jim Chen wrote on Fri, Aug 27, 2021 at 2:20 PM:
> Hi, All
> My flink version is 1.13.1 and my company have two hadoop cluster,
> offline hadoop cluster and
Hi, All
My Flink version is 1.13.1, and my company has two Hadoop clusters:
an offline Hadoop cluster and a realtime Hadoop cluster. Now, on the realtime Hadoop
cluster, we want to submit a Flink job that connects to the offline Hadoop cluster via a
different Hive catalog. I use a different Hive configuration directory
ng on. Until then, a
> workaround could be to add Hadoop manually and set the HADOOP_CLASSPATH
> environment variable. The root cause seems to be that Hadoop cannot be found.
>
> Alternatively, you could also build a custom image and include Hadoop in
> the lib folder of Flink:
>
> https:/
Thanks, this should definitely work with the pre-packaged connectors of
Ververica platform.
I guess we have to investigate what is going on. Until then, a
workaround could be to add Hadoop manually and set the HADOOP_CLASSPATH
environment variable. The root cause seems to be that Hadoop cannot
No custom code; everything through Flink SQL in the UI, no jars.
> >
> > Thanks,
> > Natu
> >
> > On Thu, Jul 22, 2021 at 2:08 PM Timo Walther <twal...@apache.org> wrote:
> >
> > Hi Natu,
> >
> > Ververica Platform 2.5 has updated
1.13.1 as the supported
flink version. No custom code; everything through Flink SQL in the UI, no jars.
Thanks,
Natu
On Thu, Jul 22, 2021 at 2:08 PM Timo Walther <twal...@apache.org> wrote:
Hi Natu,
Ververica Platform 2.5 has updated the bundled Hadoop version but this
should n
Walther wrote:
> Hi Natu,
>
> Ververica Platform 2.5 has updated the bundled Hadoop version but this
> should not result in a NoClassDefFoundError exception. How are you
> submitting your SQL jobs? You don't use Ververica's SQL service but have
> built a regular JAR file, right? I
Hi Natu,
Ververica Platform 2.5 has updated the bundled Hadoop version but this
should not result in a NoClassDefFoundError exception. How are you
submitting your SQL jobs? You don't use Ververica's SQL service but have
built a regular JAR file, right? If this is the case, can you share your
to FAILED on 10.243.3.0:42337-2a3224 @
10-243-3-0.flink-metrics.vvp-jobs.svc.cluster.local (dataPort=39309).
java.lang.NoClassDefFoundError: org/apache/hadoop/conf/Configuration
at java.lang.Class.getDeclaredConstructors0(Native Method)
~[?:1.8.0_292
Could this be a LocalStreamEnvironment limitation? Is there any way to enable
plugin loading locally?
Thanks!
On 2021/06/21 11:13:29, Yuval Itzchakov wrote:
> Currently I have the s3-hadoop dependency in my build.sbt.
>
> I guess I need to move it to the PLUGIN directory locall
Hi, I ran into this problem too. How exactly did you build your image?
In my Dockerfile I add:
COPY --chown=flink:flink jars/flink-shaded-hadoop-2-2.8.3-10.0.jar
$FLINK_HOME/lib/
But when running a Flink 1.11 session cluster on K8s,
the jobserver starts, but submitting a job fails, saying HadoopUtils cannot be initialized,
even though flink-shaded-hadoop-2-2.8.3-10.0.jar does contain the HadoopUtils class.
Caused
Hi community,
This question is cross-posted on Stack Overflow
https://stackoverflow.com/questions/67264156/flink-hive-connector-hive-conf-dir-supports-hdfs-uri-while-hadoop-conf-dir-sup
In my current setup, local dev env can access testing env. I would like to
run Flink job on local dev env
ptions. It is only used to construct the classpath for the JM/TM process.
>> However, in "HadoopUtils"[2] we do not support getting the hadoop
>> configuration from classpath.
>>
>>
>> [1].
>> https://github.com/apache/flink/blob/release-1.11/flink-dist/sr
at 4:52 AM Yang Wang wrote:
>>
>>> It seems that we do not export HADOOP_CONF_DIR as environment variables
>>> in current implementation, even though we have set the env.xxx flink config
>>> options. It is only used to construct the classpath for the JM/TM process.
It seems that we do not export HADOOP_CONF_DIR as environment variables in
current implementation, even though we have set the env.xxx flink config
options. It is only used to construct the classpath for the JM/TM process.
However, in "HadoopUtils"[2] we do not support getting
Hi Robert,
indeed my docker-compose setup only works if I also add the Hadoop and YARN home
variables, while I was expecting those two variables to be generated automatically
just by setting the env.xxx variables in the FLINK_PROPERTIES variable.
I just want to understand what to expect, and whether I really need to specify
Hi,
I'm not aware of any known issues with Hadoop and Flink on Docker.
I also tried what you are doing locally, and it seems to work:
flink-jobmanager| 2021-04-15 18:37:48,300 INFO
org.apache.flink.runtime.entrypoint.ClusterEntrypoint[] - Starting
StandaloneSessionClusterEntrypoint
Hi everybody,
I'm trying to set up reading from HDFS using docker-compose and Flink
1.11.3.
If I pass 'env.hadoop.conf.dir' and 'env.yarn.conf.dir'
using FLINK_PROPERTIES (under environment section of the docker-compose
service) I see in the logs the following line:
"Could not find H
This looks related to HDFS-12920, where Hadoop 2.x tries to read a
duration from hdfs-default.xml expecting plain numbers, but in 3.x the values
also contain time units.
On 3/30/2021 9:37 AM, Matthias Seiler wrote:
Thank you all for the replies!
I did as @Maminspapin suggested and indeed
"30s"
// this is thrown by the flink-shaded-hadoop library
```
I thought that it relates to the windowing I do, which has a slide
interval of 30 seconds, but removing it displays the same error.
I also added the dependency to the maven pom, but without effect.
Since I use Hadoop 3.2.1, I also t
Hey Matthias,
Maybe the classpath contains hadoop libraries, but not the HDFS libraries?
The "DistributedFileSystem" class needs to be accessible to the
classloader. Can you check if that class is available?
Best,
Robert
On Thu, Mar 25, 2021 at 11:10 AM Matthias Seiler <
hi all
1) Flink on YARN: a question about flink-conf and the hadoop/yarn configuration (text garbled in the archive)
https://issues.apache.org/jira/browse/FLINK-21981
2) How the hadoop Configuration vs. the yarn configuration is obtained (text garbled in the archive)
https
I downloaded the lib (last version) from here:
https://repo.maven.apache.org/maven2/org/apache/flink/flink-shaded-hadoop-2-uber/2.8.3-7.0/
and put it in the flink_home/lib directory.
It helped.
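For reference, that step can be sketched as follows (a sketch only; the URL follows the standard Maven repository layout from the link above, and the download line is commented out so nothing is fetched unintentionally):

```shell
# Compose the download URL for the shaded Hadoop uber jar (version taken from
# the thread) and show where it would go; uncomment curl to actually fetch it.
VERSION=2.8.3-7.0
JAR="flink-shaded-hadoop-2-uber-${VERSION}.jar"
URL="https://repo.maven.apache.org/maven2/org/apache/flink/flink-shaded-hadoop-2-uber/${VERSION}/${JAR}"
echo "$URL"
# curl -fLo "$FLINK_HOME/lib/$JAR" "$URL"
```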
--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
I have the same problem ...
I have the same problem ...
Hello everybody,
I set up a Flink (1.12.1) and Hadoop (3.2.1) cluster on two machines.
The job should store the checkpoints on HDFS like so:
```java
StreamExecutionEnvironment env =
    StreamExecutionEnvironment.getExecutionEnvironment();
env.enableCheckpointing(15000);
```
Can a Flink JobManager in HA mode be restarted manually, like a Hadoop NameNode, while the cluster keeps running?
I've noticed the JobManager's memory usage seems to grow slowly but steadily; the Hadoop NameNode has the same problem, which I work around by rolling-restarting the
NameNode at intervals. In HA mode, can the Flink JobManager be rolling-restarted the same way?
--
Sent from: http://apache-flink.147419.n8.nabble.com/
Thanks a lot for reporting this problem Debraj. I've created a JIRA issue
for it [1].
[1] https://issues.apache.org/jira/browse/FLINK-21723
Cheers,
Till
On Tue, Mar 9, 2021 at 5:28 AM Debraj Manna
wrote:
> Hi
>
> It appears the Hadoop Integration
> <https://ci.apache.org/proje
Hi
It appears the Hadoop Integration
<https://ci.apache.org/projects/flink/flink-docs-release-1.12/ops/deployment/hadoop.html>
link is broken on the downloads <https://flink.apache.org/downloads.html> page.
Apache Flink® 1.12.2 is our latest stable release.
> If you plan to use Apache
Hi all,
during a security maintenance PR [1], Chesnay noticed that the
flink-swift-fs-hadoop module is lacking test coverage [2].
Also, there hasn't been any substantial change since 2018, when it was
introduced.
On the user@ ML, I could not find any proof of significant use of the
module (no one
> >> @Michael Ran: OK, no worries.
> >>
> >> @张锴 Which Flink version's connector do you mean, stream or sql? I searched and mine doesn't have it. I'm on 1.12, stream.
> >>
> >>
> >> The docs show a StreamFileSink and also a FileSink, and their usage looks similar. I plan to try FileSink, but I'm unclear on the difference between FileSink and StreamFileSink, and whether both can write to Hadoop-style file systems, since atomic writes matter
> >> @Michael Ran: OK, no worries.
> >>
> >> @张锴 Which Flink version's connector do you mean, stream or sql? I searched and mine doesn't have it. I'm on 1.12, stream.
> >>
> >>
> The docs show a StreamFileSink and also a FileSink, and their usage looks similar. I plan to try FileSink, but I'm unclear on the difference between FileSink and StreamFileSink, and whether both can write to Hadoop-style file systems, since atomic writes matter and some distributed file systems don't support append or edit.
>>
>> The docs show a StreamFileSink and also a FileSink, and their usage looks similar. I plan to try FileSink, but I'm unclear on the difference between FileSink and StreamFileSink, and whether both can write to Hadoop-style file systems, since atomic writes matter and some distributed file systems don't support append or edit.
>>
>> Michael Ran wrote on Thu, Jan 21, 2021 at 7:01 PM:
>>
>> >
>> >
>> Sorry, I haven't used this in a long time. But you can analyze it from the exception message and the API source code to determine
I'm using Flink 1.10, where FileSink is BucketingSink; that's what I use to write to HDFS.
赵一旦 wrote on Thu, Jan 21, 2021 at 7:05 PM:
> @Michael Ran: OK, no worries.
>
> @张锴 Which Flink version's connector do you mean, stream or sql? I searched and mine doesn't have it. I'm on 1.12, stream.
>
> The docs show a StreamFileSink and also a FileSink, and their usage looks similar. I plan to try FileSink, but I'm unclear on the difference between FileSink and StreamFileSink, and whether both can write to Hadoo
@Michael Ran: OK, no worries.
@张锴 Which Flink version's connector do you mean, stream or sql? I searched and mine doesn't have it. I'm on 1.12, stream.
The docs show a StreamFileSink and also a FileSink, and their usage looks similar. I plan to try FileSink, but I'm unclear on the difference between FileSink and StreamFileSink, and whether both can write to Hadoop-style file systems, since atomic writes matter and some distributed file systems don't support append or edit.
Michael Ran wrote on Thu, Jan 21, 2021 at 7:01 PM:
>
> Sorry, I haven't used this in a long time. But you can
Writer(org.apache.hadoop.fs.FileSystem fs) {...}
>> > On 2021-01-21 17:18:23, "赵一旦" wrote:
>> > >The exact error is:
>> > >
>> > >java.lang.UnsupportedOperationException: Recoverable writers on Hadoop
> This probably uses HDFS-specific APIs; the file system isn't compatible with public
> > HadoopRecoverableWriter(org.apache.hadoop.fs.FileSystem fs) {...}
> > On 2021-01-21 17:18:23, "赵一旦" wrote:
> > >The exact error is:
> > >
> > >java.lang.UnsupportedOperationExcep
wrote:
> >The exact error is:
> >
> >java.lang.UnsupportedOperationException: Recoverable writers on Hadoop are
> >only supported for HDFS
> >at org.apache.flink.runtime.fs.hdfs.HadoopRecoverableWriter.(
> >HadoopRecoverableWriter.java:61)
> >at org.apache.flink.runtime.fs.hdfs.HadoopFil
This probably uses HDFS-specific APIs; the file system isn't compatible with public
HadoopRecoverableWriter(org.apache.hadoop.fs.FileSystem fs) {...}
On 2021-01-21 17:18:23, "赵一旦" wrote:
>The exact error is:
>
>java.lang.UnsupportedOperationException: Recoverable writers on Hadoop are
>o
Besides this, Flink SQL also fails to read our existing Hive data warehouse. The Hive catalog is configured fine and table metadata comes through, but SELECT operations simply fail.
赵一旦 wrote on Thu, Jan 21, 2021 at 5:18 PM:
> The exact error is:
>
> java.lang.UnsupportedOperationException: Recoverable writers on Hadoop
> are only supported for HDFS
> at org.apache.flink.runtime.fs.hdfs.HadoopRe
The exact error is:
java.lang.UnsupportedOperationException: Recoverable writers on Hadoop are
only supported for HDFS
at org.apache.flink.runtime.fs.hdfs.HadoopRecoverableWriter.(
HadoopRecoverableWriter.java:61)
at org.apache.flink.runtime.fs.hdfs.HadoopFileSystem
.createRecoverableWriter
Recoverable writers on Hadoop are only supported for HDFS
As noted above, we use the hadoop protocol, but the underlying storage is not HDFS; it is our company's in-house distributed file system.
Writing with Spark and reading with spark-sql both work fine, but so far neither writing nor reading with Flink has succeeded.
K8s HA with HDFS (text garbled in the archive)
On Dec 22, 2020 at 13:43, liujian wrote:
Thanks. With the history server configured in flink-conf and backed by HDFS, the web UI ... (remainder garbled)
> COPY ./jar/flink-shaded-hadoop-2-uber-2.8.3-10.0.jar
> /opt/flink/lib/flink-shaded-hadoop-2-uber-2.8.3-10.0.jar
> ENTRYPOINT ["/docker-entrypoint.sh"]
> EXPOSE 6123 8081 8082
> CMD ["help","history-server"]
Thanks. I'm using Docker with the Native K8s deployment (text partly garbled).
The Dockerfile:
COPY ./jar/flink-shaded-hadoop-2-uber-2.8.3-10.0.jar
/opt/flink/lib/flink-shaded-hadoop-2-uber-2.8.3-10.0.jar
ENTRYPOINT ["/docker-entrypoint.sh"]
EXPOSE
"user-zh"
> <
> danrtsey...@gmail.com;
> Sent: Monday, Dec 21, 2020, 10:15 AM
> To: "user-zh"
> Subject: Re: how to choose the flink-shaded-hadoop-2-uber version
>
>
>
>
> You don't need to modify CMD; the default entrypoint is docker-entrypoint.sh[1], which sup
About the history-server (text partly garbled)
<
> danrtsey...@gmail.com;
> Sent: Saturday, Dec 19, 2020, 9:35 PM
> To: "user-zh"
> Subject: Re: how to choose the flink-shaded-hadoop-2-uber version
>
>
>
> You only need to set the HADOOP_CONF_DIR environment on the Flink Client side
> Flink
>
> Client will automatically ship hdfs-site.xml and core-site.xml by creating a separate ConfigMap, then
Thanks.
For the history server I'm on flink
1.12.0; the Dockerfile uses CMD ["history-server"]
and the web UI on port 8082 ... (remainder garbled)
You only need to set the HADOOP_CONF_DIR environment on the Flink Client side.
Flink
Client will automatically ship hdfs-site.xml and core-site.xml by creating a separate ConfigMap and mounting it into the JobManager and TaskManager.
These two config files are also loaded onto the classpath automatically; as long as flink-shaded-hadoop is under lib/, nothing else is needed and HDFS can be accessed directly.
Best,
Yang
liujian <13597820...@qq.com> wrote on Sat, Dec 19, 2020 at 8:29 PM:
>
> HD
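The client-side setup described above can be sketched like this (a sketch only; the conf path is an assumption for your environment, and the actual submission command is elided):

```shell
# Point the Flink client at the Hadoop configuration; on native K8s the client
# ships hdfs-site.xml / core-site.xml from this directory via a ConfigMap.
# /etc/hadoop/conf is an assumed path; adjust it to your cluster.
export HADOOP_CONF_DIR=/etc/hadoop/conf
echo "Using HADOOP_CONF_DIR=$HADOOP_CONF_DIR"
# flink run-application --target kubernetes-application ...  (submission elided)
```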
The HDFS is HA; should hdfs-site.xml go into the ConfigMap or into $FLINK_HOME/conf? (text partly garbled)
"user-zh"
When accessing HDFS on K8s you still need to put flink-shaded-hadoop under the lib directory, because Hadoop's FileSystem currently does not support plugin loading.
Best,
Yang
superainbower wrote on Wed, Dec 16, 2020 at 6:19 PM:
> Piggybacking on this thread: how do you access HDFS when deploying on K8s? For now I still bake the shaded jar into the image.
> On Dec 16, 2020 at 10:53, Yang Wang wrote:
>
> Take flink-shaded-hadoop-2-uber 2.8.3-10.0 as an example
>
> 2.8.3 refers to
Piggybacking on this thread: how do you access HDFS when deploying on K8s? For now I still bake the shaded jar into the image.
On Dec 16, 2020 at 10:53, Yang Wang wrote:
Take flink-shaded-hadoop-2-uber 2.8.3-10.0 as an example:
2.8.3 is the Hadoop version; 10.0 is the flink-shaded[1] version.
Since 1.10 the community no longer recommends the flink-shaded-hadoop approach; instead, submit with the HADOOP_CLASSPATH environment variable set[2],
which keeps Flink hadoop-free and able to support both hadoop2 and hadoop3.
If you still insist on using
Take flink-shaded-hadoop-2-uber 2.8.3-10.0 as an example:
2.8.3 is the Hadoop version; 10.0 is the flink-shaded[1] version.
Since 1.10 the community no longer recommends the flink-shaded-hadoop approach; instead, submit with the HADOOP_CLASSPATH environment variable set[2],
which keeps Flink hadoop-free and able to support both hadoop2 and hadoop3.
If you still insist on flink-shaded-hadoop, the advice is simply to use the latest version, 2.8.3-10.0.
[1]. https://github.com/apache/flink
You need to confirm that `hadoop classpath` returns the complete classpath; normally that command includes every Hadoop jar.
If missing classes or methods are reported, check whether the corresponding jars exist and were included.
The community recommends the `hadoop classpath` approach mainly to keep Flink hadoop-free, so it runs fine on both hadoop2 and hadoop3.
Best,
Yang
Jacob <17691150...@163.com> wrote on Tue, Dec 15, 2020 at 9:25 AM:
> Thanks for the reply!
>
> I have looked at that document too.
>
>
Thanks for the reply!
I have looked at that document too.
A few days ago I tested job submission with each of the Flink 1.9-1.12 clients and found that
for 1.10+, manually running export HADOOP_CLASSPATH=`hadoop
classpath` had no effect; I got all kinds of errors, mostly missing Hadoop classes or methods (NoSuchMethod and the like). Reworking the pom over and over didn't help either; in the end, merely adding the flink-shaded-hadoop-2-uber*-* dependency to the pom let the job submit and run normally.
Flink no longer recommends putting the Hadoop jars into lib/.
You can load the Hadoop dependencies with
export HADOOP_CLASSPATH=`hadoop classpath`
Reference:
https://ci.apache.org/projects/flink/flink-docs-release-1.11/ops/deployment/hadoop.html#providing-hadoop-classes
While upgrading the Flink version I need to put this jar into flink/lib, but how do I determine its version number?
flink-shaded-hadoop-2-uber*-*
How should the flink-shaded-hadoop-2-uber version be chosen?
What do the xxx-xxx parts mean, respectively?
hi, you can build the image directly on one of the Hadoop nodes: when building, package the required Hadoop dependencies and Flink into the Docker image together, then configure the environment variables and it's ready to use. If the nodes where Docker is deployed already have Hadoop or Flink, you can also mount them from outside. We currently use the first approach.
Yang Wang wrote on Mon, Oct 12, 2020 at 10:23 AM:
> Just start from the community base image, add one more layer (copying in flink-shaded-hadoop), commit the docker
> image, and push it to a docker registry.
>
> For example, the Dockerfile can be as follows
Just start from the community base image, add one more layer (copying in flink-shaded-hadoop), commit the docker
image, and push it to a docker registry.
For example, the Dockerfile can be:
FROM flink:1.11.1-scala_2.11
COPY flink-shaded-hadoop-2*.jar /opt/flink/lib/
Also, flink-shaded-hadoop can be downloaded here[1]
[1].
https://mvnrepository.com/artifact/org.apache.flink/flink-shaded-hadoop
May I ask how you rebuilt the image: did you unpack the original jar and then repackage it?
May I ask: to rebuild the image, do you unpack the original package and then repackage it?
Hi Dan,
Your approach in general is good. You might want to use the bundled hadoop
uber jar [1] to save some time if you find the appropriate version. You can
also build your own version and include it then in lib/.
In general, I'd recommend moving away from sequence files. As soon as you
change
I was able to get a basic version to work by including a bunch of hadoop
and s3 dependencies in the job jar and hacking in some hadoop config
values. It's probably not optimal but it looks like I'm unblocked.
On Fri, Aug 28, 2020 at 12:11 PM Dan Hill wrote:
> I'm assuming I have a sim
these Sequence files, I get the
following error:
NoClassDefFoundError: org/apache/hadoop/mapred/FileInputFormat
It fails on this readSequenceFile.
env.createInput(HadoopInputs.readSequenceFile(Text.class,
ByteWritable.class, INPUT_FILE))
If I directly depend on org-apache-hadoop/hadoop-mapred when
OK, thanks for the reply.
With the Hive version set to 2.1.1, I chose to import the hive-exec-2.1.1 and flink-connector-hive_2.11-1.11.1 dependencies in my program, and Hive tables work normally;
best,
amenhub
From: Rui Li
Sent: 2020-08-24 21:33
To: user-zh
Subject: Re: hive-exec dependency causing a Hadoop conflict
Hi,
hive-exec itself does not contain Hadoop; if it is pulled in by Maven's transitive dependencies, you can exclude it when packaging. The Hadoop version used at runtime can be your cluster's Hadoop version rather than the one Hive itself dep
Hi,
hive-exec itself does not contain Hadoop; if it is pulled in by Maven's transitive dependencies, you can exclude it when packaging. The Hadoop version used at runtime can be your cluster's Hadoop version rather than the Hadoop version Hive itself depends on. Also, for Flink
1.11 consider the official flink-sql-connector-hive uber
jar, which contains all of Hive's dependencies (the Hadoop dependencies still need to be added separately). For more details see the docs [1][2].
[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/table
To add: when I removed the Hadoop dependencies such as hive-exec from the program, the job still failed, so perhaps I missed something somewhere. I suspected a dependency conflict because, before testing the Hive integration, I had submitted to YARN and run without problems, which pointed my troubleshooting at Hive.
Now it looks like some other cause. A bit of the exception stack:
Caused by: org.apache.flink.client.deployment.ClusterDeploymentException: Could
not deploy Yarn job cluster