Ottomata has submitted this change and it was merged.

Change subject: Ensure useful (mostly python) packages are on analytics worker and client nodes
......................................................................


Ensure useful (mostly python) packages are on analytics worker and client nodes

This specifically helps with PySpark MLlib, but in general it is
good to have these kinds of packages available for computation
across the Analytics Cluster.

Change-Id: I748f5af91b81b3b477515563e195dacc828edbb2
---
M manifests/role/analytics.pp
M manifests/role/analytics/hadoop.pp
A manifests/role/analytics/packages.pp
3 files changed, 24 insertions(+), 7 deletions(-)

Approvals:
  Ottomata: Looks good to me, approved
  jenkins-bot: Verified



diff --git a/manifests/role/analytics.pp b/manifests/role/analytics.pp
index c3db210..9a9db6c 100644
--- a/manifests/role/analytics.pp
+++ b/manifests/role/analytics.pp
@@ -45,13 +45,6 @@
         require => Package['icedtea-7-jre-jamvm'],
     }
 
-    # jq is very useful, install it.
-    if !defined(Package['jq']) {
-        package { 'jq':
-            ensure => 'installed',
-        }
-    }
-
     # ipython-notebook is very useful, install it.
     if !defined(Package['ipython-notebook']) {
         package { 'ipython-notebook':
@@ -62,6 +55,8 @@
     # include maven to build jars for Hadoop.
     include ::maven
 
+    # install packages that should be on all hadoop worker nodes
+    include role::analytics::packages
 }
 
 # == Class role::analytics::hadoop::monitor_disks
diff --git a/manifests/role/analytics/hadoop.pp b/manifests/role/analytics/hadoop.pp
index e7d7f59..708a3e9 100644
--- a/manifests/role/analytics/hadoop.pp
+++ b/manifests/role/analytics/hadoop.pp
@@ -557,6 +557,9 @@
 
     # Install MaxMind databases for geocoding UDFs
     include geoip
+
+    # install packages that should be on all hadoop worker nodes
+    include role::analytics::packages
 }
 
 # == Class role::analytics::hadoop::monitor::nsca::client
diff --git a/manifests/role/analytics/packages.pp b/manifests/role/analytics/packages.pp
new file mode 100644
index 0000000..75bb7be
--- /dev/null
+++ b/manifests/role/analytics/packages.pp
@@ -0,0 +1,19 @@
+# == Class role::analytics::packages
+# This class should be included on all analytics
+# client and worker nodes.  It will install packages
+# that are useful for distributed computation
+# in Hadoop, and thus should be available on
+# any workers, and clients for testing.
+#
+class role::analytics::packages {
+    ensure_packages([
+        'python-numpy',
+        'python-pandas',
+        'python-scipy',
+        'python-requests',
+        'python-matplotlib',
+        'python-dateutil',
+        'python-sympy',
+        'jq',
+    ])
+}

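Note for readers: the new class replaces the hand-written
`if !defined(Package[...])` guard with `ensure_packages()`. The sketch
below shows roughly what that call amounts to for each listed name. It
is an illustration only, not part of this change: it assumes
`ensure_packages()` comes from puppetlabs-stdlib, uses Puppet 4+
iteration syntax, and the `$example_packages` variable is hypothetical.
The real function's handling of already-declared resources differs in
detail.

```puppet
# Hypothetical illustration; not part of the merged change.
# $example_packages is an assumed name holding a subset of the list.
$example_packages = ['python-numpy', 'jq']

# Declare each package only if no other class has declared it already,
# which avoids duplicate resource declaration errors when several
# roles want the same package.
$example_packages.each |String $pkg| {
    if !defined(Package[$pkg]) {
        package { $pkg:
            ensure => 'present',
        }
    }
}
```

Keeping the list in one class lets both the client and the worker roles
include it without repeating the guard for every package.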
-- 
To view, visit https://gerrit.wikimedia.org/r/203080

Gerrit-MessageType: merged
Gerrit-Change-Id: I748f5af91b81b3b477515563e195dacc828edbb2
Gerrit-PatchSet: 2
Gerrit-Project: operations/puppet
Gerrit-Branch: production
Gerrit-Owner: Ottomata <o...@wikimedia.org>
Gerrit-Reviewer: Ottomata <o...@wikimedia.org>
Gerrit-Reviewer: jenkins-bot <>
