Repository: spark
Updated Branches:
  refs/heads/master d74373225 -> 37a5e272f


[SPARK-4809] Rework Guava library shading.

The current way of shading Guava is problematic. Code that depends
on "spark-core" does not see Guava as a transitive dependency, yet
classes in "spark-core" actually depend on it. Running unit tests
that use spark-core classes therefore requires adding a compatible
version of Guava to your own dependencies, which is error-prone and
a poor user experience.

This change modifies the way Guava is shaded so that it's applied
uniformly across the Spark build. This means Guava is shaded inside
spark-core itself, so that the dependency issues above are solved.
Aside from that, all Spark sub-modules have their Guava references
relocated, so that they refer to the relocated classes now packaged
inside spark-core. Previously, relocation happened only when the
assembly was built, so projects that did not end up inside the assembly
(such as streaming backends) could still reference the original location
of Guava classes.
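
To illustrate the relocation described above, here is a minimal Java sketch
of how a class reference is rewritten. The pattern, shaded pattern, and
exclude list are copied from the root pom.xml in this change; the class
RelocationSketch and its prefix-based wildcard matching are hypothetical
simplifications, not maven-shade-plugin's actual implementation.

```java
import java.util.Arrays;
import java.util.List;

/** Simplified model of the Guava relocation rule; not the shade plugin's code. */
public class RelocationSketch {
    // Values taken from the relocation added to the root pom.xml in this change.
    static final String PATTERN = "com/google/common/";
    static final String SHADED = "org/spark-project/guava/";
    static final List<String> EXCLUDES = Arrays.asList(
        "com/google/common/base/Absent*",
        "com/google/common/base/Function",
        "com/google/common/base/Optional*",
        "com/google/common/base/Present*",
        "com/google/common/base/Supplier");

    // Assumption: a trailing '*' is treated as a plain prefix match.
    public static boolean isExcluded(String path) {
        for (String ex : EXCLUDES) {
            if (ex.endsWith("*")) {
                if (path.startsWith(ex.substring(0, ex.length() - 1))) {
                    return true;
                }
            } else if (path.equals(ex)) {
                return true;
            }
        }
        return false;
    }

    /** Returns the class path a reference would point at after relocation. */
    public static String relocate(String path) {
        if (!path.startsWith(PATTERN) || isExcluded(path)) {
            return path; // not Guava, or one of the classes exposed by the Java API
        }
        return SHADED + path.substring(PATTERN.length());
    }

    public static void main(String[] args) {
        System.out.println(relocate("com/google/common/collect/Lists"));
        System.out.println(relocate("com/google/common/base/Optional"));
    }
}
```

Under these assumptions, a reference to com/google/common/collect/Lists is
rewritten to org/spark-project/guava/collect/Lists, while
com/google/common/base/Optional keeps its original name.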

The Guava classes are added to the "first" artifact Spark generates
(network-common), so that all downstream modules have the needed
classes available. Since "network-common" is a dependency of spark-core,
all Spark apps should get the relocated classes automatically.
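
One way an application could check which Guava classes it actually sees is
sketched below. The class ShadedGuavaCheck is hypothetical, and the candidate
class names are assumptions derived from the relocation pattern in the root
pom.xml rather than names confirmed by this message.

```java
public class ShadedGuavaCheck {
    /** Returns true if the named class can be loaded from the current classpath. */
    public static boolean isOnClasspath(String className) {
        try {
            // Load without initializing; we only care whether the class is visible.
            Class.forName(className, false, ShadedGuavaCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed names, derived from the relocation pattern in the root pom.xml.
        String[] candidates = {
            "org.spark-project.guava.collect.Lists", // relocated copy
            "com.google.common.base.Optional",       // excluded from relocation
            "com.google.common.collect.Lists"        // present only if the app adds Guava itself
        };
        for (String name : candidates) {
            System.out.println(name + " -> " + isOnClasspath(name));
        }
    }
}
```

The result depends entirely on the classpath the check runs with, so no
particular output is implied here.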

Author: Marcelo Vanzin <[email protected]>

Closes #3658 from vanzin/SPARK-4809 and squashes the following commits:

3c93e42 [Marcelo Vanzin] Shade Guava in the network-common artifact.
5d69ec9 [Marcelo Vanzin] Merge branch 'master' into SPARK-4809
b3104fc [Marcelo Vanzin] Add comment.
941848f [Marcelo Vanzin] Merge branch 'master' into SPARK-4809
f78c48a [Marcelo Vanzin] Merge branch 'master' into SPARK-4809
8053dd4 [Marcelo Vanzin] Merge branch 'master' into SPARK-4809
107d7da [Marcelo Vanzin] Add fix for SPARK-5052 (PR #3874).
40b8723 [Marcelo Vanzin] Merge branch 'master' into SPARK-4809
4a4ed42 [Marcelo Vanzin] [SPARK-4809] Rework Guava library shading.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/37a5e272
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/37a5e272
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/37a5e272

Branch: refs/heads/master
Commit: 37a5e272f898e946c09c2e7de5d1bda6f27a8f39
Parents: d743732
Author: Marcelo Vanzin <[email protected]>
Authored: Wed Jan 28 00:29:29 2015 -0800
Committer: Patrick Wendell <[email protected]>
Committed: Wed Jan 28 00:29:29 2015 -0800

----------------------------------------------------------------------
 assembly/pom.xml        |  22 ---------
 core/pom.xml            |  48 --------------------
 examples/pom.xml        | 103 ++++++++++++++-----------------------------
 network/common/pom.xml  |  24 +++++++---
 network/shuffle/pom.xml |   1 -
 pom.xml                 |  22 ++++++++-
 streaming/pom.xml       |   8 ++++
 7 files changed, 81 insertions(+), 147 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/37a5e272/assembly/pom.xml
----------------------------------------------------------------------
diff --git a/assembly/pom.xml b/assembly/pom.xml
index 594fa0c..1bb5a67 100644
--- a/assembly/pom.xml
+++ b/assembly/pom.xml
@@ -43,12 +43,6 @@
   </properties>
 
   <dependencies>
-    <!-- Promote Guava to compile scope in this module so it's included while shading. -->
-    <dependency>
-      <groupId>com.google.guava</groupId>
-      <artifactId>guava</artifactId>
-      <scope>compile</scope>
-    </dependency>
     <dependency>
       <groupId>org.apache.spark</groupId>
       <artifactId>spark-core_${scala.binary.version}</artifactId>
@@ -133,22 +127,6 @@
               <goal>shade</goal>
             </goals>
             <configuration>
-              <relocations>
-                <relocation>
-                  <pattern>com.google</pattern>
-                  <shadedPattern>org.spark-project.guava</shadedPattern>
-                  <includes>
-                    <include>com.google.common.**</include>
-                  </includes>
-                  <excludes>
-                    <exclude>com/google/common/base/Absent*</exclude>
-                    <exclude>com/google/common/base/Function</exclude>
-                    <exclude>com/google/common/base/Optional*</exclude>
-                    <exclude>com/google/common/base/Present*</exclude>
-                    <exclude>com/google/common/base/Supplier</exclude>
-                  </excludes>
-                </relocation>
-              </relocations>
               <transformers>
                 <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer" />
                 <transformer implementation="org.apache.maven.plugins.shade.resource.AppendingTransformer">

http://git-wip-us.apache.org/repos/asf/spark/blob/37a5e272/core/pom.xml
----------------------------------------------------------------------
diff --git a/core/pom.xml b/core/pom.xml
index 1984682..3c51b2d 100644
--- a/core/pom.xml
+++ b/core/pom.xml
@@ -106,16 +106,6 @@
       <groupId>org.eclipse.jetty</groupId>
       <artifactId>jetty-server</artifactId>
     </dependency>
-    <!--
-      Promote Guava to "compile" so that maven-shade-plugin picks it up (for packaging the Optional
-      class exposed in the Java API). The plugin will then remove this dependency from the published
-      pom, so that Guava does not pollute the client's compilation classpath.
-    -->
-    <dependency>
-      <groupId>com.google.guava</groupId>
-      <artifactId>guava</artifactId>
-      <scope>compile</scope>
-    </dependency>
     <dependency>
       <groupId>org.apache.commons</groupId>
       <artifactId>commons-lang3</artifactId>
@@ -352,44 +342,6 @@
       </plugin>
       <plugin>
         <groupId>org.apache.maven.plugins</groupId>
-        <artifactId>maven-shade-plugin</artifactId>
-        <executions>
-          <execution>
-            <phase>package</phase>
-            <goals>
-              <goal>shade</goal>
-            </goals>
-            <configuration>
-              <shadedArtifactAttached>false</shadedArtifactAttached>
-              <artifactSet>
-                <includes>
-                  <include>com.google.guava:guava</include>
-                </includes>
-              </artifactSet>
-              <filters>
-                <!-- See comment in the guava dependency declaration above. -->
-                <filter>
-                  <artifact>com.google.guava:guava</artifact>
-                  <includes>
-                    <include>com/google/common/base/Absent*</include>
-                    <include>com/google/common/base/Function</include>
-                    <include>com/google/common/base/Optional*</include>
-                    <include>com/google/common/base/Present*</include>
-                    <include>com/google/common/base/Supplier</include>
-                  </includes>
-                </filter>
-              </filters>
-            </configuration>
-          </execution>
-        </executions>
-      </plugin>
-      <!--
-        Copy guava to the build directory. This is needed to make the SPARK_PREPEND_CLASSES
-        option work in compute-classpath.sh, since it would put the non-shaded Spark classes in
-        the runtime classpath.
-      -->
-      <plugin>
-        <groupId>org.apache.maven.plugins</groupId>
         <artifactId>maven-dependency-plugin</artifactId>
         <executions>
           <execution>

http://git-wip-us.apache.org/repos/asf/spark/blob/37a5e272/examples/pom.xml
----------------------------------------------------------------------
diff --git a/examples/pom.xml b/examples/pom.xml
index 4b92147..8caad2b 100644
--- a/examples/pom.xml
+++ b/examples/pom.xml
@@ -35,12 +35,6 @@
   <url>http://spark.apache.org/</url>
 
   <dependencies>
-    <!-- Promote Guava to compile scope in this module so it's included while shading. -->
-    <dependency>
-      <groupId>com.google.guava</groupId>
-      <artifactId>guava</artifactId>
-      <scope>compile</scope>
-    </dependency>
     <dependency>
       <groupId>org.apache.spark</groupId>
       <artifactId>spark-core_${scala.binary.version}</artifactId>
@@ -310,69 +304,40 @@
       <plugin>
         <groupId>org.apache.maven.plugins</groupId>
         <artifactId>maven-shade-plugin</artifactId>
-        <executions>
-          <execution>
-            <phase>package</phase>
-            <goals>
-              <goal>shade</goal>
-            </goals>
-            <configuration>
-            <shadedArtifactAttached>false</shadedArtifactAttached>
-            <outputFile>${project.build.directory}/scala-${scala.binary.version}/spark-examples-${project.version}-hadoop${hadoop.version}.jar</outputFile>
-            <artifactSet>
-              <includes>
-                <include>*:*</include>
-              </includes>
-            </artifactSet>
-            <filters>
-              <filter>
-                <artifact>com.google.guava:guava</artifact>
-                <excludes>
-                  <!--
-                    Exclude all Guava classes so they're picked up from the main assembly. The
-                    dependency still needs to be compile-scoped so that the relocation below
-                    works.
-                  -->
-                  <exclude>**</exclude>
-                </excludes>
-              </filter>
-              <filter>
-                <artifact>*:*</artifact>
-                <excludes>
-                  <exclude>META-INF/*.SF</exclude>
-                  <exclude>META-INF/*.DSA</exclude>
-                  <exclude>META-INF/*.RSA</exclude>
-                </excludes>
-              </filter>
-            </filters>
-              <relocations>
-                <relocation>
-                  <pattern>com.google</pattern>
-                  <shadedPattern>org.spark-project.guava</shadedPattern>
-                  <includes>
-                    <include>com.google.common.**</include>
-                  </includes>
-                  <excludes>
-                    <exclude>com.google.common.base.Optional**</exclude>
-                  </excludes>
-                </relocation>
-                <relocation>
-                  <pattern>org.apache.commons.math3</pattern>
-                  <shadedPattern>org.spark-project.commons.math3</shadedPattern>
-                </relocation>
-              </relocations>
-              <transformers>
-                <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer" />
-                <transformer implementation="org.apache.maven.plugins.shade.resource.AppendingTransformer">
-                  <resource>reference.conf</resource>
-                </transformer>
-                <transformer implementation="org.apache.maven.plugins.shade.resource.DontIncludeResourceTransformer">
-                  <resource>log4j.properties</resource>
-                </transformer>
-              </transformers>
-            </configuration>
-          </execution>
-        </executions>
+        <configuration>
+          <shadedArtifactAttached>false</shadedArtifactAttached>
+          <outputFile>${project.build.directory}/scala-${scala.binary.version}/spark-examples-${project.version}-hadoop${hadoop.version}.jar</outputFile>
+          <artifactSet>
+            <includes>
+              <include>*:*</include>
+            </includes>
+          </artifactSet>
+          <filters>
+            <filter>
+              <artifact>*:*</artifact>
+              <excludes>
+                <exclude>META-INF/*.SF</exclude>
+                <exclude>META-INF/*.DSA</exclude>
+                <exclude>META-INF/*.RSA</exclude>
+              </excludes>
+            </filter>
+          </filters>
+          <relocations combine.children="append">
+            <relocation>
+              <pattern>org.apache.commons.math3</pattern>
+              <shadedPattern>org.spark-project.commons.math3</shadedPattern>
+            </relocation>
+          </relocations>
+          <transformers>
+            <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer" />
+            <transformer implementation="org.apache.maven.plugins.shade.resource.AppendingTransformer">
+              <resource>reference.conf</resource>
+            </transformer>
+            <transformer implementation="org.apache.maven.plugins.shade.resource.DontIncludeResourceTransformer">
+              <resource>log4j.properties</resource>
+            </transformer>
+          </transformers>
+        </configuration>
       </plugin>
     </plugins>
   </build>

http://git-wip-us.apache.org/repos/asf/spark/blob/37a5e272/network/common/pom.xml
----------------------------------------------------------------------
diff --git a/network/common/pom.xml b/network/common/pom.xml
index 245a96b..5a9bbe1 100644
--- a/network/common/pom.xml
+++ b/network/common/pom.xml
@@ -48,10 +48,15 @@
       <artifactId>slf4j-api</artifactId>
       <scope>provided</scope>
     </dependency>
+    <!--
+      Promote Guava to "compile" so that maven-shade-plugin picks it up (for packaging the Optional
+      class exposed in the Java API). The plugin will then remove this dependency from the published
+      pom, so that Guava does not pollute the client's compilation classpath.
+    -->
     <dependency>
       <groupId>com.google.guava</groupId>
       <artifactId>guava</artifactId>
-      <scope>provided</scope>
+      <scope>compile</scope>
     </dependency>
 
     <!-- Test dependencies -->
@@ -88,11 +93,6 @@
         <version>2.2</version>
         <executions>
           <execution>
-            <goals>
-              <goal>test-jar</goal>
-            </goals>
-          </execution>
-          <execution>
             <id>test-jar-on-test-compile</id>
             <phase>test-compile</phase>
             <goals>
@@ -101,6 +101,18 @@
           </execution>
         </executions>
       </plugin>
+      <plugin>
+        <groupId>org.apache.maven.plugins</groupId>
+        <artifactId>maven-shade-plugin</artifactId>
+        <configuration>
+          <shadedArtifactAttached>false</shadedArtifactAttached>
+          <artifactSet>
+            <includes>
+              <include>com.google.guava:guava</include>
+            </includes>
+          </artifactSet>
+        </configuration>
+      </plugin>
     </plugins>
   </build>
 </project>

http://git-wip-us.apache.org/repos/asf/spark/blob/37a5e272/network/shuffle/pom.xml
----------------------------------------------------------------------
diff --git a/network/shuffle/pom.xml b/network/shuffle/pom.xml
index 5bfa1ac..c2d0300 100644
--- a/network/shuffle/pom.xml
+++ b/network/shuffle/pom.xml
@@ -52,7 +52,6 @@
     <dependency>
       <groupId>com.google.guava</groupId>
       <artifactId>guava</artifactId>
-      <scope>provided</scope>
     </dependency>
 
     <!-- Test dependencies -->

http://git-wip-us.apache.org/repos/asf/spark/blob/37a5e272/pom.xml
----------------------------------------------------------------------
diff --git a/pom.xml b/pom.xml
index 05cb379..4adfdf3 100644
--- a/pom.xml
+++ b/pom.xml
@@ -1264,7 +1264,10 @@
           </execution>
         </executions>
       </plugin>
-      <!-- The shade plug-in is used here to create effective pom's (see SPARK-3812). -->
+      <!--
+        The shade plug-in is used here to create effective pom's (see SPARK-3812), and also
+        remove references from the shaded libraries from artifacts published by Spark.
+      -->
       <plugin>
         <groupId>org.apache.maven.plugins</groupId>
         <artifactId>maven-shade-plugin</artifactId>
@@ -1276,6 +1279,23 @@
               <include>org.spark-project.spark:unused</include>
             </includes>
           </artifactSet>
+          <relocations>
+            <relocation>
+              <pattern>com.google.common</pattern>
+              <shadedPattern>org.spark-project.guava</shadedPattern>
+              <excludes>
+                <!--
+                  These classes cannot be relocated, because the Java API exposes the
+                  "Optional" type; the others are referenced by the Optional class.
+                -->
+                <exclude>com/google/common/base/Absent*</exclude>
+                <exclude>com/google/common/base/Function</exclude>
+                <exclude>com/google/common/base/Optional*</exclude>
+                <exclude>com/google/common/base/Present*</exclude>
+                <exclude>com/google/common/base/Supplier</exclude>
+              </excludes>
+            </relocation>
+          </relocations>
         </configuration>
         <executions>
           <execution>

http://git-wip-us.apache.org/repos/asf/spark/blob/37a5e272/streaming/pom.xml
----------------------------------------------------------------------
diff --git a/streaming/pom.xml b/streaming/pom.xml
index 22b0d71..98f5b41 100644
--- a/streaming/pom.xml
+++ b/streaming/pom.xml
@@ -95,6 +95,14 @@
           </execution>
         </executions>
       </plugin>
+
+      <plugin>
+        <groupId>org.apache.maven.plugins</groupId>
+        <artifactId>maven-shade-plugin</artifactId>
+        <configuration>
+          <shadeTestJar>true</shadeTestJar>
+        </configuration>
+      </plugin>
     </plugins>
     <resources>
       <resource>

