[jira] [Assigned] (SPARK-6511) Publish hadoop provided build with instructions for different distros
[ https://issues.apache.org/jira/browse/SPARK-6511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell reassigned SPARK-6511: -- Assignee: Patrick Wendell Publish hadoop provided build with instructions for different distros --- Key: SPARK-6511 URL: https://issues.apache.org/jira/browse/SPARK-6511 Project: Spark Issue Type: Improvement Components: Build Reporter: Patrick Wendell Assignee: Patrick Wendell Currently we publish a series of binaries with different Hadoop client jars. This mostly works, but some users have reported compatibility issues with different distributions. One improvement moving forward might be to publish a binary build that simply asks you to set HADOOP_HOME to pick up the Hadoop client location. That way it would work across multiple distributions, even if they have subtle incompatibilities with upstream Hadoop. I think a first step for this would be to produce such a build for the community and see how well it works. One potential issue is that our fancy excludes and dependency re-writing won't work with the simpler append Hadoop's classpath to Spark. Also, how we deal with the Hive dependency is unclear, i.e. should we continue to bundle Spark's Hive (which has some fixes for dependency conflicts) or do we allow for linking against vanilla Hive at runtime. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-6511) Publish hadoop provided build with instructions for different distros
[ https://issues.apache.org/jira/browse/SPARK-6511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-6511: --- Assignee: Apache Spark (was: Patrick Wendell) Publish hadoop provided build with instructions for different distros --- Key: SPARK-6511 URL: https://issues.apache.org/jira/browse/SPARK-6511 Project: Spark Issue Type: Improvement Components: Build Reporter: Patrick Wendell Assignee: Apache Spark Currently we publish a series of binaries with different Hadoop client jars. This mostly works, but some users have reported compatibility issues with different distributions. One improvement moving forward might be to publish a binary build that simply asks you to set HADOOP_HOME to pick up the Hadoop client location. That way it would work across multiple distributions, even if they have subtle incompatibilities with upstream Hadoop. I think a first step for this would be to produce such a build for the community and see how well it works. One potential issue is that our fancy excludes and dependency re-writing won't work with the simpler append Hadoop's classpath to Spark. Also, how we deal with the Hive dependency is unclear, i.e. should we continue to bundle Spark's Hive (which has some fixes for dependency conflicts) or do we allow for linking against vanilla Hive at runtime. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-6511) Publish hadoop provided build with instructions for different distros
[ https://issues.apache.org/jira/browse/SPARK-6511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-6511: --- Assignee: Patrick Wendell (was: Apache Spark) Publish hadoop provided build with instructions for different distros --- Key: SPARK-6511 URL: https://issues.apache.org/jira/browse/SPARK-6511 Project: Spark Issue Type: Improvement Components: Build Reporter: Patrick Wendell Assignee: Patrick Wendell Currently we publish a series of binaries with different Hadoop client jars. This mostly works, but some users have reported compatibility issues with different distributions. One improvement moving forward might be to publish a binary build that simply asks you to set HADOOP_HOME to pick up the Hadoop client location. That way it would work across multiple distributions, even if they have subtle incompatibilities with upstream Hadoop. I think a first step for this would be to produce such a build for the community and see how well it works. One potential issue is that our fancy excludes and dependency re-writing won't work with the simpler append Hadoop's classpath to Spark. Also, how we deal with the Hive dependency is unclear, i.e. should we continue to bundle Spark's Hive (which has some fixes for dependency conflicts) or do we allow for linking against vanilla Hive at runtime. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org