Overall this sounds good to me.

We already know that Bigtop's package dependencies are not exhaustive.
For example, while many components depend on Java, there is no package-level
dependency on Java; users can choose a JDK distribution based on their
preferences. As follow-up work, we could provide utilities like
bigtop-detect-javahome from bigtop-utils.
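To sketch the idea (the candidate paths below are illustrative, not the actual list used by bigtop-detect-javahome), such a utility essentially just probes well-known JDK locations:

```shell
# Rough sketch of JAVA_HOME detection, loosely modeled on
# bigtop-detect-javahome; the candidate paths are examples only.
for candidate in \
    /usr/lib/jvm/java-11-openjdk* \
    /usr/lib/jvm/java-8-openjdk* \
    /usr/java/default; do
  # Pick the first directory that actually contains a java binary.
  if [ -e "$candidate/bin/java" ]; then
    export JAVA_HOME="$candidate"
    break
  fi
done
echo "JAVA_HOME=${JAVA_HOME:-not found}"
```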

Since the change affects all platforms, fixing issues like BIGTOP-3978 and
BIGTOP-3979 first may make the work easier from a testing perspective.

On 2023/12/18 16:04, Jialiang Cai wrote:
My apologies, I didn't clarify earlier. I don't want to remove Python 3. What I
mean is to remove the 'require: python' dependency from the Spark spec and
control files, so that installing Spark won't pull in a Python dependency. If
users need PySpark, they can manually install the corresponding Python version
themselves, for example using Conda.
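For illustration only (the file names and exact lines are assumptions; Bigtop's actual packaging files may differ), the change would drop dependency declarations along these lines:

```
# RPM spec for Spark: remove the explicit Python requirement, e.g.
Requires: python

# Debian control file for Spark: drop python from the Depends list, e.g.
Depends: bigtop-utils (>= 1.0), python
```

With those lines gone, installing the Spark packages via yum or apt would no longer pull in a system Python.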

Additionally, there is a lot of extra code in Bigtop for managing Python 2. As
far as I know, all components now support Python 3, and Python 2 has been
deprecated for a long time; Bigtop just hasn't done the Python 3 upgrade work
yet.

The upgrade is nontrivial because it involves the Python version used in
Spark 3 packaging, as well as the GPDB, Ranger, and Phoenix Python
dependencies, but these issues can be resolved.

Ambari used to depend strongly on Python 2, but Ambari has been dropped from
Bigtop, and none of the other components have a strong dependency on Python 2.
In Spark, PySpark can be managed separately by users, so pinning a Python 3
version in the packaging isn't a good choice.
GPDB 6 officially supports Python 3.
Ranger doesn't strictly require Python for installation; although it has some
Python 2 scripts, they are used relatively sparingly.
So, one goal of this discussion is to remove Python as a dependency for Spark
installation and to facilitate Bigtop's future upgrade from Python 2 to
Python 3.

On Dec 18, 2023, at 14:50, 李帅 <lishuaipeng...@gmail.com> wrote:

Python 3 has a lot of compatibility issues; different Linux distros ship
different Python 3 versions.

Jialiang Cai <jialiangca...@gmail.com> wrote on Mon, Dec 18, 2023, at 09:46:

Dear Community Members,

I would like to initiate a discussion regarding the removal of Python from
the Spark3 installation package. Here are a few reasons for considering
this change:

1. Unlike Apache Ambari, which installs components individually, Spark3's
core functionality does not depend on Python 3. Therefore, it may not be
appropriate to make Python 3 a mandatory installation dependency for Spark:
Spark itself can run without Python 3, and users who do not intend to use
PySpark should still be able to install and use Spark without any issues.

2. The Python 3 version required by PySpark is often relatively high, and
many operating systems do not provide such a recent Python by default.
Including PySpark's Python 3 dependency in the Bigtop codebase would
introduce significant complexity. It might be more suitable for users to
manually install the specific Python 3 version required by PySpark, for
example using Conda or other methods.
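As point 2 suggests, a user who wants PySpark can provision the interpreter themselves; for example (the environment name and version numbers here are illustrative, not Bigtop recommendations):

```shell
# Illustrative only: provision a PySpark-compatible Python without
# any help from the Spark OS package. Versions are examples.
conda create -y -n pyspark python=3.9
conda activate pyspark
pip install pyspark
```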

3. Removing the Python 3 dependency from Spark can also benefit Bigtop's
overall transition from Python 2 to Python 3. Python 2 has not been
maintained for a considerable period, and streamlining the codebase to work
with Python 3 is a step toward maintaining the project's relevance and
security.

I encourage everyone to share their thoughts and opinions on this matter.
Your feedback is valuable as we consider the best course of action.

Thank you for your participation and input.

Best regards,
jiaLiang
