This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
     new 2b76ef43afb9 [SPARK-47683][PYTHON][BUILD][FOLLOW-UP] Exclude 
`lib/py4j*zip` in `pyspark-connect` package
2b76ef43afb9 is described below

commit 2b76ef43afb9fdf0f9028e4cb7d23fceb1d0015f
Author: Hyukjin Kwon <gurwls...@apache.org>
AuthorDate: Thu May 2 12:22:42 2024 +0900

    [SPARK-47683][PYTHON][BUILD][FOLLOW-UP] Exclude `lib/py4j*zip` in 
`pyspark-connect` package
    
    ### What changes were proposed in this pull request?
    
    This PR is a follow-up to https://github.com/apache/spark/pull/45053, after which `lib/py4j*zip` ends up in the `pyspark-connect` package. Currently it is picked up by https://github.com/apache/spark/blob/master/python/MANIFEST.in#L26. The other files are not included because `setup.py` for `pyspark-connect` does not create the `deps` directory, but `lib` exists and is therefore included.
    
    ### Why are the changes needed?
    
    To exclude unrelated files.
    
    ### Does this PR introduce _any_ user-facing change?
    
    No, the main change has not been released yet.
    
    ### How was this patch tested?
    
    Manually packaged, and checked the contents via `vi`.
    
    ### Was this patch authored or co-authored using generative AI tooling?
    
    No.
    
    Closes #46331 from HyukjinKwon/SPARK-47683-followup.
    
    Authored-by: Hyukjin Kwon <gurwls...@apache.org>
    Signed-off-by: Hyukjin Kwon <gurwls...@apache.org>
---
 python/packaging/classic/setup.py |  5 +++++
 python/packaging/connect/setup.py | 10 +++++++++-
 2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/python/packaging/classic/setup.py b/python/packaging/classic/setup.py
index 8478a60d633b..5e94c2b65380 100755
--- a/python/packaging/classic/setup.py
+++ b/python/packaging/classic/setup.py
@@ -204,8 +204,13 @@ try:
     copyfile("pyspark/shell.py", "pyspark/python/pyspark/shell.py")
 
     if in_spark:
+        # !!HACK ALERT!!
+        # `setup.py` has to be located in the same directory as the package.
+        # Therefore, we copy the current file, and place it in the `spark/python` directory.
+        # After that, we remove it in the end.
         copyfile("packaging/classic/setup.py", "setup.py")
         copyfile("packaging/classic/setup.cfg", "setup.cfg")
+
         # Construct the symlink farm - this is necessary since we can't refer to
         # the path above the package root and we need to copy the jars and scripts which
         # are up above the python root.
diff --git a/python/packaging/connect/setup.py b/python/packaging/connect/setup.py
index 20133dc9eb11..bc1d4fd2868d 100755
--- a/python/packaging/connect/setup.py
+++ b/python/packaging/connect/setup.py
@@ -25,7 +25,7 @@
 import sys
 from setuptools import setup
 import os
-from shutil import copyfile
+from shutil import copyfile, move
 import glob
 from pathlib import Path
 
@@ -109,6 +109,13 @@ if "SPARK_TESTING" in os.environ:
 
 try:
     if in_spark:
+        # !!HACK ALERT!!
+        # 1. `setup.py` has to be located in the same directory as the package.
+        #    Therefore, we copy the current file, and place it in the `spark/python` directory.
+        #    After that, we remove it in the end.
+        # 2. Here it renames `lib` to `lib.back` so MANIFEST.in does not pick `py4j` up.
+        #    We rename it back in the end.
+        move("lib", "lib.back")
         copyfile("packaging/connect/setup.py", "setup.py")
         copyfile("packaging/connect/setup.cfg", "setup.cfg")
 
@@ -207,5 +214,6 @@ try:
     )
 finally:
     if in_spark:
+        move("lib.back", "lib")
         os.remove("setup.py")
         os.remove("setup.cfg")
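
The rename-and-restore step in the `pyspark-connect` hunks above can be sketched in isolation as a small context manager. This is only an illustration, not code from the commit: the helper name `temporarily_renamed`, the `demo` function, and the placeholder file name `py4j-src.zip` are all hypothetical.

```python
import contextlib
import os
import shutil
import tempfile


@contextlib.contextmanager
def temporarily_renamed(path, suffix=".back"):
    """Move `path` aside for the duration of the block, then move it back.

    Hypothetical helper; the commit itself calls shutil.move() directly.
    """
    backup = path + suffix
    shutil.move(path, backup)
    try:
        yield backup
    finally:
        # Restore the original name even if the wrapped step raised.
        shutil.move(backup, path)


def demo():
    """Exercise the helper in a throwaway directory."""
    with tempfile.TemporaryDirectory() as root:
        lib = os.path.join(root, "lib")
        os.mkdir(lib)
        # Placeholder file standing in for the bundled py4j archive.
        open(os.path.join(lib, "py4j-src.zip"), "w").close()
        with temporarily_renamed(lib):
            # While packaging would run, `lib` is not visible under its name.
            hidden = not os.path.exists(lib)
        restored = os.path.exists(os.path.join(lib, "py4j-src.zip"))
        return hidden, restored
```

One difference from the diff is worth noting: the sketch performs the first move before entering `try`, so the restore in `finally` only runs when the rename actually happened. Whether that matters for `setup.py` is a judgment call.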


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
