[MediaWiki-commits] [Gerrit] Using proper key for offset path config - change (analytics/kafkatee)
Ottomata has submitted this change and it was merged. Change subject: Using proper key for offset path config .. Using proper key for offset path config Change-Id: I6840d1ad6e17506baf36d5edcca59e8e9fb85dda --- M kafkatee.conf.example 1 file changed, 1 insertion(+), 1 deletion(-) Approvals: Ottomata: Verified; Looks good to me, approved diff --git a/kafkatee.conf.example b/kafkatee.conf.example index c4752b0..60e6065 100644 --- a/kafkatee.conf.example +++ b/kafkatee.conf.example @@ -106,7 +106,7 @@ # Offset file directory. # Each topic + partition combination has its own offset file. # Default: current directory -#kafka.offset.store.path = /var/run/offsets/ +#kafka.topic.offset.store.path = /var/run/offsets/ # If the request offset was not found on broker, or there is no # initial offset known (no stored offset), then reset the offset according -- To view, visit https://gerrit.wikimedia.org/r/111493 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: merged Gerrit-Change-Id: I6840d1ad6e17506baf36d5edcca59e8e9fb85dda Gerrit-PatchSet: 1 Gerrit-Project: analytics/kafkatee Gerrit-Branch: master Gerrit-Owner: Ottomata Gerrit-Reviewer: Ottomata ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
[MediaWiki-commits] [Gerrit] Using proper key for offset path config - change (analytics/kafkatee)
Ottomata has uploaded a new change for review. https://gerrit.wikimedia.org/r/111493 Change subject: Using proper key for offset path config .. Using proper key for offset path config Change-Id: I6840d1ad6e17506baf36d5edcca59e8e9fb85dda --- M kafkatee.conf.example 1 file changed, 1 insertion(+), 1 deletion(-) git pull ssh://gerrit.wikimedia.org:29418/analytics/kafkatee refs/changes/93/111493/1 diff --git a/kafkatee.conf.example b/kafkatee.conf.example index c4752b0..60e6065 100644 --- a/kafkatee.conf.example +++ b/kafkatee.conf.example @@ -106,7 +106,7 @@ # Offset file directory. # Each topic + partition combination has its own offset file. # Default: current directory -#kafka.offset.store.path = /var/run/offsets/ +#kafka.topic.offset.store.path = /var/run/offsets/ # If the request offset was not found on broker, or there is no # initial offset known (no stored offset), then reset the offset according -- To view, visit https://gerrit.wikimedia.org/r/111493 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I6840d1ad6e17506baf36d5edcca59e8e9fb85dda Gerrit-PatchSet: 1 Gerrit-Project: analytics/kafkatee Gerrit-Branch: master Gerrit-Owner: Ottomata ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
[MediaWiki-commits] [Gerrit] emery: RT #6143 move two logs to erbium - change (operations/puppet)
Ottomata has submitted this change and it was merged. Change subject: emery: RT #6143 move two logs to erbium .. emery: RT #6143 move two logs to erbium Change-Id: I248299864ffc89ba3547444f4f2c7419e9f2c646 --- M manifests/misc/statistics.pp M templates/udp2log/filters.emery.erb M templates/udp2log/filters.erbium.erb 3 files changed, 9 insertions(+), 4 deletions(-) Approvals: Ottomata: Verified; Looks good to me, approved diff --git a/manifests/misc/statistics.pp b/manifests/misc/statistics.pp index 1852091..d1ae628 100644 --- a/manifests/misc/statistics.pp +++ b/manifests/misc/statistics.pp @@ -536,7 +536,7 @@ # sampled-1000 logs from emery misc::statistics::rsync_job { "sampled_1000": -source => "emery.wikimedia.org::udp2log/webrequest/archive/sampled-1000*.gz", +source => "erbium.wikimedia.org::udp2log/webrequest/archive/sampled-1000*.gz", destination => "/a/squid/archive/sampled", } diff --git a/templates/udp2log/filters.emery.erb b/templates/udp2log/filters.emery.erb index 9b42b3b..6784566 100644 --- a/templates/udp2log/filters.emery.erb +++ b/templates/udp2log/filters.emery.erb @@ -9,5 +9,3 @@ # This log file is also on gadolinium for redundancy file 1000 <%= log_directory %>/sampled-1000.tsv.log -## This feeds all http related graphs in graphite / gdash.wikimedia.org -pipe 2 /usr/local/bin/sqstat 2 diff --git a/templates/udp2log/filters.erbium.erb b/templates/udp2log/filters.erbium.erb index a6528cc..9a926fe 100644 --- a/templates/udp2log/filters.erbium.erb +++ b/templates/udp2log/filters.erbium.erb @@ -24,4 +24,11 @@ pipe 100 /usr/bin/udp-filter -F '\t' -p /w/api.php >> <%= webrequest_log_directory %>/api-usage.tsv.log ### GLAM NARA / National Archives - RT 2212 -#pipe 10 /usr/bin/udp-filter -F '\t' -p _NARA_ -g -b country >> <%=log_directory %>/glam_nara.tsv.log +pipe 10 /usr/bin/udp-filter -F '\t' -p _NARA_ -g -b country >> <%=log_directory %>/glam_nara.tsv.log + +### This feeds all http related graphs in graphite / gdash.wikimedia.org +pipe 2 /usr/local/bin/sqstat 2 + +### 0.0001 of all udp2log messages +## This log file is also on gadolinium for redundancy +file 1000 <%= log_directory %>/sampled-1000.tsv.log -- To view, visit https://gerrit.wikimedia.org/r/110382 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: merged Gerrit-Change-Id: I248299864ffc89ba3547444f4f2c7419e9f2c646 Gerrit-PatchSet: 3 Gerrit-Project: operations/puppet Gerrit-Branch: production Gerrit-Owner: Matanya Gerrit-Reviewer: Ottomata Gerrit-Reviewer: jenkins-bot <> ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
[MediaWiki-commits] [Gerrit] Parameterizing queue_buffering_max_ms and batch_num_messages - change (operations...varnishkafka)
Ottomata has submitted this change and it was merged. Change subject: Parameterizing queue_buffering_max_ms and batch_num_messages .. Parameterizing queue_buffering_max_ms and batch_num_messages Change-Id: I2181807d72e8dc9c2a99247b9bc739119704da62 --- M manifests/defaults.pp M manifests/init.pp M templates/varnishkafka.conf.erb 3 files changed, 18 insertions(+), 1 deletion(-) Approvals: Ottomata: Verified; Looks good to me, approved diff --git a/manifests/defaults.pp b/manifests/defaults.pp index bacd4ad..a8f76a9 100644 --- a/manifests/defaults.pp +++ b/manifests/defaults.pp @@ -14,6 +14,8 @@ $partition = -1 $queue_buffering_max_messages = 10 +$queue_buffering_max_ms = 1000 +$batch_num_messages = 1000 $message_send_max_retries = 3 $topic_request_required_acks= 1 $topic_message_timeout_ms = 30 diff --git a/manifests/init.pp b/manifests/init.pp index 3a30b02..21f3593 100644 --- a/manifests/init.pp +++ b/manifests/init.pp @@ -22,9 +22,13 @@ # $format_key - Kafka message key format string. # Default: undef (disables Kafka message key usage). # $partition- Topic partition number to send to. -1 for random. -# Default: -1. +# Default: -1 # $queue_buffering_max_messages - Maximum number of messages allowed on the # local Kafka producer queue. Default: 10 +# $queue_buffering_max_ms - Maximum time, in milliseconds, for buffering +# data on the producer queue. Default: 1000 +# $batch_num_messages - Maximum number of messages batched in one MessageSet. +# Default: 1000 # $message_send_max_retries - Maximum number of retries per messageset. # Default: 3 # $topic_request_required_acks - Required ack level. Default: 1 @@ -79,6 +83,9 @@ $partition = $varnishkafka::defaults::partition, $queue_buffering_max_messages = $varnishkafka::defaults::queue_buffering_max_messages, +$queue_buffering_max_ms = $varnishkafka::defaults::queue_buffering_max_ms, +$batch_num_messages = $varnishkafka::defaults::batch_num_messages, + $message_send_max_retries = $varnishkafka::defaults::message_send_max_retries, $topic_request_required_acks= $varnishkafka::defaults::topic_request_required_acks, $topic_message_timeout_ms = $varnishkafka::defaults::topic_message_timeout_ms, diff --git a/templates/varnishkafka.conf.erb b/templates/varnishkafka.conf.erb index a438234..a6df726 100644 --- a/templates/varnishkafka.conf.erb +++ b/templates/varnishkafka.conf.erb @@ -251,6 +251,14 @@ # Defaults to 100 kafka.queue.buffering.max.messages = <%= @queue_buffering_max_messages %> +# Maximum time, in milliseconds, for buffering data on the producer queue. +# Defaults to 1000 (1 second) +kafka.queue.buffering.max.ms = <%= @queue_buffering_max_ms %> + +# Maximum number of messages batched in one MessageSet. +# Defaults to 1000 +kafka.batch.num.messages = <%= @batch_num_messages %> + # Maximum number of retries per messageset. kafka.message.send.max.retries = <%= @message_send_max_retries %> -- To view, visit https://gerrit.wikimedia.org/r/111455 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: merged Gerrit-Change-Id: I2181807d72e8dc9c2a99247b9bc739119704da62 Gerrit-PatchSet: 1 Gerrit-Project: operations/puppet/varnishkafka Gerrit-Branch: master Gerrit-Owner: Ottomata Gerrit-Reviewer: Ottomata ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
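The change above only exposes the two settings on the varnishkafka module side; a caller then passes them as ordinary class parameters. A minimal sketch of such a declaration follows. The values are purely illustrative, and the other parameters a real role would pass (log format, retry and timeout settings, and so on) are omitted for brevity:

# Hypothetical caller-side use of the newly parameterized settings; example values only.
class { '::varnishkafka':
    # Buffer messages on the local producer queue for at most 500 ms.
    queue_buffering_max_ms => 500,
    # Batch at most 2000 messages into one MessageSet.
    batch_num_messages     => 2000,
}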
[MediaWiki-commits] [Gerrit] [DO NOT MERGE] Setting batch_num_messages to 6000 - change (operations/puppet)
Ottomata has uploaded a new change for review. https://gerrit.wikimedia.org/r/111523 Change subject: [DO NOT MERGE] Setting batch_num_messages to 6000 .. [DO NOT MERGE] Setting batch_num_messages to 6000 Don't merge this yet. Magnus and I are trying to catch cp3019 misbehaving again, and we think that this setting might fix it. Change-Id: I2243fc06f410a5bdf8a8ce489e3a3b1fe16ecb2c --- M manifests/role/cache.pp M modules/varnishkafka 2 files changed, 4 insertions(+), 0 deletions(-) git pull ssh://gerrit.wikimedia.org:29418/operations/puppet refs/changes/23/111523/1 diff --git a/manifests/role/cache.pp b/manifests/role/cache.pp index aed7e3c..af24a20 100644 --- a/manifests/role/cache.pp +++ b/manifests/role/cache.pp @@ -429,6 +429,10 @@ format => "%{fake_tag0@hostname?${::fqdn}}x %{@sequence!num?0}n %{%FT%T@dt}t %{Varnish:time_firstbyte@time_firstbyte!num?0.0}x %{@ip}h %{Varnish:handling@cache_status}x %{@http_status}s %{@response_size!num?0}b %{@http_method}m %{Host@uri_host}i %{@uri_path}U %{@uri_query}q %{Content-Type@content_type}o %{Referer@referer}i %{X-Forwarded-For@x_forwarded_for}i %{User-Agent@user_agent}i %{Accept-Language@accept_language}i %{X-Analytics@x_analytics}o", message_send_max_retries => 3, queue_buffering_max_messages => 200, +# bits varnishes do about 6000 reqs / sec each. +# We want to buffer for about max 1 second. +batch_num_messages => 6000, + # large timeout to account for potential cross DC latencies topic_request_timeout_ms => 3, # request ack timeout # Write out stats to varnishkafka.stats.json diff --git a/modules/varnishkafka b/modules/varnishkafka index bab034a..04ece4f 16 --- a/modules/varnishkafka +++ b/modules/varnishkafka -Subproject commit bab034a007f665303bc3c3fad48e34eccbb3648c +Subproject commit 04ece4fb9de20929eaaad0c7a26100d424db22e4 -- To view, visit https://gerrit.wikimedia.org/r/111523 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I2243fc06f410a5bdf8a8ce489e3a3b1fe16ecb2c Gerrit-PatchSet: 1 Gerrit-Project: operations/puppet Gerrit-Branch: production Gerrit-Owner: Ottomata ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
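The 6000 figure above follows from the sizing comment in the diff: at roughly 6000 requests per second per bits cache and a target of at most about one second of buffering, a full batch works out to about 6000 msg/s * 1 s = 6000 messages, which lines up with the queue_buffering_max_ms default of 1000 ms introduced by the previous change (assuming librdkafka sends a MessageSet once either limit is reached, whichever comes first).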
[MediaWiki-commits] [Gerrit] Adding Replica-MaxLag to Ganglia kafka view - change (operations/puppet)
Ottomata has uploaded a new change for review. https://gerrit.wikimedia.org/r/111617 Change subject: Adding Replica-MaxLag to Ganglia kafka view .. Adding Replica-MaxLag to Ganglia kafka view Change-Id: I76e183b523ba09000499c6509342fae346081a8a --- M manifests/misc/monitoring.pp 1 file changed, 6 insertions(+), 0 deletions(-) git pull ssh://gerrit.wikimedia.org:29418/operations/puppet refs/changes/17/111617/1 diff --git a/manifests/misc/monitoring.pp b/manifests/misc/monitoring.pp index eff8f07..a35e20c 100644 --- a/manifests/misc/monitoring.pp +++ b/manifests/misc/monitoring.pp @@ -229,6 +229,12 @@ 'type' => 'stack', }, +# Replica Max Lag +{ +'host_regex' => $kafka_broker_host_regex, +'metric_regex' => 'kafka.server.ReplicaFetcherManager.Replica-MaxLag.Value', +'type' => 'line', +}, # Under Replicated Partitions { 'host_regex' => $kafka_broker_host_regex, -- To view, visit https://gerrit.wikimedia.org/r/111617 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I76e183b523ba09000499c6509342fae346081a8a Gerrit-PatchSet: 1 Gerrit-Project: operations/puppet Gerrit-Branch: production Gerrit-Owner: Ottomata ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
[MediaWiki-commits] [Gerrit] Adding Replica-MaxLag to Ganglia kafka view - change (operations/puppet)
Ottomata has submitted this change and it was merged. Change subject: Adding Replica-MaxLag to Ganglia kafka view .. Adding Replica-MaxLag to Ganglia kafka view Change-Id: I76e183b523ba09000499c6509342fae346081a8a --- M manifests/misc/monitoring.pp 1 file changed, 6 insertions(+), 0 deletions(-) Approvals: Ottomata: Verified; Looks good to me, approved diff --git a/manifests/misc/monitoring.pp b/manifests/misc/monitoring.pp index eff8f07..a35e20c 100644 --- a/manifests/misc/monitoring.pp +++ b/manifests/misc/monitoring.pp @@ -229,6 +229,12 @@ 'type' => 'stack', }, +# Replica Max Lag +{ +'host_regex' => $kafka_broker_host_regex, +'metric_regex' => 'kafka.server.ReplicaFetcherManager.Replica-MaxLag.Value', +'type' => 'line', +}, # Under Replicated Partitions { 'host_regex' => $kafka_broker_host_regex, -- To view, visit https://gerrit.wikimedia.org/r/111617 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: merged Gerrit-Change-Id: I76e183b523ba09000499c6509342fae346081a8a Gerrit-PatchSet: 1 Gerrit-Project: operations/puppet Gerrit-Branch: production Gerrit-Owner: Ottomata Gerrit-Reviewer: Ottomata ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
[MediaWiki-commits] [Gerrit] Parameterizing num.replica.fetchers and replica.fetch.max.bytes - change (operations...kafka)
Ottomata has submitted this change and it was merged. Change subject: Parameterizing num.replica.fetchers and replica.fetch.max.bytes .. Parameterizing num.replica.fetchers and replica.fetch.max.bytes Change-Id: I1e8726a094fb930c8fc2b04687c24bb703d013c7 --- M manifests/defaults.pp M manifests/server.pp M templates/server.properties.erb 3 files changed, 21 insertions(+), 0 deletions(-) Approvals: Ottomata: Verified; Looks good to me, approved jenkins-bot: Verified diff --git a/manifests/defaults.pp b/manifests/defaults.pp index 7dafcc5..17e2193 100644 --- a/manifests/defaults.pp +++ b/manifests/defaults.pp @@ -31,6 +31,8 @@ $nofiles_ulimit = 8192 $auto_create_topics_enable = false +$num_replica_fetchers= 1 +$replica_fetch_max_bytes = 1048576 $num_network_threads = 2 $num_io_threads = 2 diff --git a/manifests/server.pp b/manifests/server.pp index 9d15246..d55591e 100644 --- a/manifests/server.pp +++ b/manifests/server.pp @@ -39,6 +39,13 @@ # # $auto_create_topics_enable- If autocreation of topics is allowed. Default: false # +# $num_replica_fetchers - Number of threads used to replicate messages from leaders. +# Default: 1 +# +# $replica_fetch_max_bytes - The number of bytes of messages to attempt to fetch for each +# partition in the fetch requests the replicas send to the leader. +# Default: 1024 * 1024 +# # $num_network_threads - The number of threads handling network # requests. Default: 2 # @@ -99,6 +106,8 @@ $nofiles_ulimit = $kafka::defaults::nofiles_ulimit, $auto_create_topics_enable = $kafka::defaults::auto_create_topics_enable, +$num_replica_fetchers= $kafka::defaults::num_replica_fetchers, +$replica_fetch_max_bytes = $kafka::defaults::replica_fetch_max_bytes, $num_network_threads = $kafka::defaults::num_network_threads, $num_io_threads = $kafka::defaults::num_io_threads, diff --git a/templates/server.properties.erb b/templates/server.properties.erb index 3443677..0e5618f 100644 --- a/templates/server.properties.erb +++ b/templates/server.properties.erb @@ -56,6 +56,16 @@ # and number of partitions. auto.create.topics.enable=<%= @auto_create_topics_enable ? 'true' : 'false' %> +# Number of threads used to replicate messages from leaders. Increasing this +# value can increase the degree of I/O parallelism in the follower broker. +# This is useful to temporarily increase if you have a broker that needs +# to catch up on messages to get back into the ISR. +num.replica.fetchers=<%= @num_replica_fetchers %> + +# The number of byes of messages to attempt to fetch for each partition in the +# fetch requests the replicas send to the leader. +replica.fetch.max.bytes=<%= @replica_fetch_max_bytes %> + # Log Flush Policy # # The following configurations control the flush of data to disk. This is the most -- To view, visit https://gerrit.wikimedia.org/r/111812 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: merged Gerrit-Change-Id: I1e8726a094fb930c8fc2b04687c24bb703d013c7 Gerrit-PatchSet: 1 Gerrit-Project: operations/puppet/kafka Gerrit-Branch: master Gerrit-Owner: Ottomata Gerrit-Reviewer: Ottomata Gerrit-Reviewer: jenkins-bot <> ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
[MediaWiki-commits] [Gerrit] Parameterizing num.replica.fetchers and replica.fetch.max.bytes - change (operations...kafka)
Ottomata has uploaded a new change for review. https://gerrit.wikimedia.org/r/111812 Change subject: Parameterizing num.replica.fetchers and replica.fetch.max.bytes .. Parameterizing num.replica.fetchers and replica.fetch.max.bytes Change-Id: I1e8726a094fb930c8fc2b04687c24bb703d013c7 --- M manifests/defaults.pp M manifests/server.pp M templates/server.properties.erb 3 files changed, 21 insertions(+), 0 deletions(-) git pull ssh://gerrit.wikimedia.org:29418/operations/puppet/kafka refs/changes/12/111812/1 diff --git a/manifests/defaults.pp b/manifests/defaults.pp index 7dafcc5..17e2193 100644 --- a/manifests/defaults.pp +++ b/manifests/defaults.pp @@ -31,6 +31,8 @@ $nofiles_ulimit = 8192 $auto_create_topics_enable = false +$num_replica_fetchers= 1 +$replica_fetch_max_bytes = 1048576 $num_network_threads = 2 $num_io_threads = 2 diff --git a/manifests/server.pp b/manifests/server.pp index 9d15246..d55591e 100644 --- a/manifests/server.pp +++ b/manifests/server.pp @@ -39,6 +39,13 @@ # # $auto_create_topics_enable- If autocreation of topics is allowed. Default: false # +# $num_replica_fetchers - Number of threads used to replicate messages from leaders. +# Default: 1 +# +# $replica_fetch_max_bytes - The number of bytes of messages to attempt to fetch for each +# partition in the fetch requests the replicas send to the leader. +# Default: 1024 * 1024 +# # $num_network_threads - The number of threads handling network # requests. Default: 2 # @@ -99,6 +106,8 @@ $nofiles_ulimit = $kafka::defaults::nofiles_ulimit, $auto_create_topics_enable = $kafka::defaults::auto_create_topics_enable, +$num_replica_fetchers= $kafka::defaults::num_replica_fetchers, +$replica_fetch_max_bytes = $kafka::defaults::replica_fetch_max_bytes, $num_network_threads = $kafka::defaults::num_network_threads, $num_io_threads = $kafka::defaults::num_io_threads, diff --git a/templates/server.properties.erb b/templates/server.properties.erb index 3443677..0e5618f 100644 --- a/templates/server.properties.erb +++ b/templates/server.properties.erb @@ -56,6 +56,16 @@ # and number of partitions. auto.create.topics.enable=<%= @auto_create_topics_enable ? 'true' : 'false' %> +# Number of threads used to replicate messages from leaders. Increasing this +# value can increase the degree of I/O parallelism in the follower broker. +# This is useful to temporarily increase if you have a broker that needs +# to catch up on messages to get back into the ISR. +num.replica.fetchers=<%= @num_replica_fetchers %> + +# The number of byes of messages to attempt to fetch for each partition in the +# fetch requests the replicas send to the leader. +replica.fetch.max.bytes=<%= @replica_fetch_max_bytes %> + # Log Flush Policy # # The following configurations control the flush of data to disk. This is the most -- To view, visit https://gerrit.wikimedia.org/r/111812 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I1e8726a094fb930c8fc2b04687c24bb703d013c7 Gerrit-PatchSet: 1 Gerrit-Project: operations/puppet/kafka Gerrit-Branch: master Gerrit-Owner: Ottomata ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
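The two copies of this change above only add the parameters to the kafka Puppet module; a broker role overrides them when it declares kafka::server. The sketch below is hypothetical: the values are examples, and the parameters the real role also passes (brokers, log_dirs, zookeeper settings, nofiles_ulimit) are left out. The next change in this digest does exactly this for the analytics brokers, setting num_replica_fetchers to 2 and leaving replica_fetch_max_bytes at its 1 MB default.

# Hypothetical override of the new replication settings; example values only.
class { '::kafka::server':
    # Extra threads replicating messages from leaders, for more I/O parallelism.
    num_replica_fetchers    => 2,
    # Fetch up to 2 MB of messages per partition per replica fetch request.
    replica_fetch_max_bytes => 2097152,
}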
[MediaWiki-commits] [Gerrit] Increasing num_replica_fetchers to 2 on kafka brokers - change (operations/puppet)
Ottomata has submitted this change and it was merged. Change subject: Increasing num_replica_fetchers to 2 on kafka brokers .. Increasing num_replica_fetchers to 2 on kafka brokers Change-Id: I6f46006079fff26cfe071b61d351a9b0d0d77a57 --- M manifests/role/analytics/kafka.pp M modules/kafka 2 files changed, 8 insertions(+), 5 deletions(-) Approvals: Ottomata: Verified; Looks good to me, approved jenkins-bot: Verified diff --git a/manifests/role/analytics/kafka.pp b/manifests/role/analytics/kafka.pp index 0d106bf..6108416 100644 --- a/manifests/role/analytics/kafka.pp +++ b/manifests/role/analytics/kafka.pp @@ -106,11 +106,14 @@ # class role::analytics::kafka::server inherits role::analytics::kafka::client { class { '::kafka::server': -log_dirs=> $log_dirs, -brokers => $brokers, -zookeeper_hosts => $zookeeper_hosts, -zookeeper_chroot=> $zookeeper_chroot, -nofiles_ulimit => $nofiles_ulimit, +log_dirs => $log_dirs, +brokers => $brokers, +zookeeper_hosts => $zookeeper_hosts, +zookeeper_chroot => $zookeeper_chroot, +nofiles_ulimit => $nofiles_ulimit, +# bump this up to 2 to get a little more +# parallelism between replicas. +num_replica_fetchers => 2, } # Generate icinga alert if Kafka Server is not running. diff --git a/modules/kafka b/modules/kafka index f7b6606..e18f734 16 --- a/modules/kafka +++ b/modules/kafka -Subproject commit f7b6606c617a53a47bce46a755da12fe87a114cb +Subproject commit e18f734ec0446c7f2f1221f3b97e1edaddcaded8 -- To view, visit https://gerrit.wikimedia.org/r/111816 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: merged Gerrit-Change-Id: I6f46006079fff26cfe071b61d351a9b0d0d77a57 Gerrit-PatchSet: 2 Gerrit-Project: operations/puppet Gerrit-Branch: production Gerrit-Owner: Ottomata Gerrit-Reviewer: Ottomata Gerrit-Reviewer: jenkins-bot <> ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
[MediaWiki-commits] [Gerrit] Increasing num_replica_fetchers to 2 on kafka brokers - change (operations/puppet)
Ottomata has uploaded a new change for review. https://gerrit.wikimedia.org/r/111816 Change subject: Increasing num_replica_fetchers to 2 on kafka brokers .. Increasing num_replica_fetchers to 2 on kafka brokers Change-Id: I6f46006079fff26cfe071b61d351a9b0d0d77a57 --- M manifests/role/analytics/kafka.pp M modules/kafka 2 files changed, 8 insertions(+), 5 deletions(-) git pull ssh://gerrit.wikimedia.org:29418/operations/puppet refs/changes/16/111816/1 diff --git a/manifests/role/analytics/kafka.pp b/manifests/role/analytics/kafka.pp index 0d106bf..6108416 100644 --- a/manifests/role/analytics/kafka.pp +++ b/manifests/role/analytics/kafka.pp @@ -106,11 +106,14 @@ # class role::analytics::kafka::server inherits role::analytics::kafka::client { class { '::kafka::server': -log_dirs=> $log_dirs, -brokers => $brokers, -zookeeper_hosts => $zookeeper_hosts, -zookeeper_chroot=> $zookeeper_chroot, -nofiles_ulimit => $nofiles_ulimit, +log_dirs => $log_dirs, +brokers => $brokers, +zookeeper_hosts => $zookeeper_hosts, +zookeeper_chroot => $zookeeper_chroot, +nofiles_ulimit => $nofiles_ulimit, +# bump this up to 2 to get a little more +# parallelism between replicas. +num_replica_fetchers => 2, } # Generate icinga alert if Kafka Server is not running. diff --git a/modules/kafka b/modules/kafka index f7b6606..e18f734 16 --- a/modules/kafka +++ b/modules/kafka -Subproject commit f7b6606c617a53a47bce46a755da12fe87a114cb +Subproject commit e18f734ec0446c7f2f1221f3b97e1edaddcaded8 -- To view, visit https://gerrit.wikimedia.org/r/111816 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I6f46006079fff26cfe071b61d351a9b0d0d77a57 Gerrit-PatchSet: 1 Gerrit-Project: operations/puppet Gerrit-Branch: production Gerrit-Owner: Ottomata ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
[MediaWiki-commits] [Gerrit] Fixing domain name for erbium host - change (operations/puppet)
Ottomata has submitted this change and it was merged. Change subject: Fixing domain name for erbium host .. Fixing domain name for erbium host Change-Id: Ib322242615d010f9d9b98cf5c606c40fce425d6e --- M manifests/misc/statistics.pp 1 file changed, 2 insertions(+), 2 deletions(-) Approvals: Ottomata: Verified; Looks good to me, approved diff --git a/manifests/misc/statistics.pp b/manifests/misc/statistics.pp index d1ae628..10332cd 100644 --- a/manifests/misc/statistics.pp +++ b/manifests/misc/statistics.pp @@ -530,13 +530,13 @@ # API logs from emery misc::statistics::rsync_job { "api": -source => "erbium.wikimedia.org::udp2log/webrequest/archive/api-usage*.gz", +source => "erbium.eqiad.wmnet::udp2log/webrequest/archive/api-usage*.gz", destination => "/a/squid/archive/api", } # sampled-1000 logs from emery misc::statistics::rsync_job { "sampled_1000": -source => "erbium.wikimedia.org::udp2log/webrequest/archive/sampled-1000*.gz", +source => "erbium.eqiad.wmnet::udp2log/webrequest/archive/sampled-1000*.gz", destination => "/a/squid/archive/sampled", } -- To view, visit https://gerrit.wikimedia.org/r/111830 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: merged Gerrit-Change-Id: Ib322242615d010f9d9b98cf5c606c40fce425d6e Gerrit-PatchSet: 2 Gerrit-Project: operations/puppet Gerrit-Branch: production Gerrit-Owner: Nuria Gerrit-Reviewer: Ottomata Gerrit-Reviewer: jenkins-bot <> ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
[MediaWiki-commits] [Gerrit] Reflect emery -> erbium move in documentation - change (operations/puppet)
Ottomata has submitted this change and it was merged. Change subject: Reflect emery -> erbium move in documentation .. Reflect emery -> erbium move in documentation Change-Id: I6548ca51584f65b907aad4269e9c9c2e84350b66 --- M manifests/misc/statistics.pp 1 file changed, 2 insertions(+), 2 deletions(-) Approvals: Ottomata: Verified; Looks good to me, approved Matanya: Looks good to me, but someone else must approve jenkins-bot: Verified diff --git a/manifests/misc/statistics.pp b/manifests/misc/statistics.pp index 10332cd..89823c2 100644 --- a/manifests/misc/statistics.pp +++ b/manifests/misc/statistics.pp @@ -528,13 +528,13 @@ destination => "/a/squid/archive/zero", } -# API logs from emery +# API logs from erbium misc::statistics::rsync_job { "api": source => "erbium.eqiad.wmnet::udp2log/webrequest/archive/api-usage*.gz", destination => "/a/squid/archive/api", } -# sampled-1000 logs from emery +# sampled-1000 logs from erbium misc::statistics::rsync_job { "sampled_1000": source => "erbium.eqiad.wmnet::udp2log/webrequest/archive/sampled-1000*.gz", destination => "/a/squid/archive/sampled", -- To view, visit https://gerrit.wikimedia.org/r/111841 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: merged Gerrit-Change-Id: I6548ca51584f65b907aad4269e9c9c2e84350b66 Gerrit-PatchSet: 1 Gerrit-Project: operations/puppet Gerrit-Branch: production Gerrit-Owner: QChris Gerrit-Reviewer: Matanya Gerrit-Reviewer: Ottomata Gerrit-Reviewer: jenkins-bot <> ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
[MediaWiki-commits] [Gerrit] Adding alerts for Kafka Broker replication metrics - change (operations/puppet)
Ottomata has uploaded a new change for review. https://gerrit.wikimedia.org/r/112065 Change subject: Adding alerts for Kafka Broker replication metrics .. Adding alerts for Kafka Broker replication metrics Change-Id: I3116f74b7fcf6a66f865f4c5acba98593a513ffe --- M manifests/role/analytics/kafka.pp 1 file changed, 26 insertions(+), 1 deletion(-) git pull ssh://gerrit.wikimedia.org:29418/operations/puppet refs/changes/65/112065/1 diff --git a/manifests/role/analytics/kafka.pp b/manifests/role/analytics/kafka.pp index 6108416..b7dcbeb 100644 --- a/manifests/role/analytics/kafka.pp +++ b/manifests/role/analytics/kafka.pp @@ -171,5 +171,30 @@ critical=> ':1000.0', require => Class['::kafka::server::jmxtrans'], } -} +# Alert if any Kafka has under replicated partitions. +# If it does, this means a broker replica is falling behind +# and will be removed from the ISR. +monitor_ganglia { 'kafka-broker-UnderReplicatedPartitions': +description => 'Kafka Broker Under Replicated Partitions', +metric => 'kafka.server.ReplicaManager.UnderReplicatedPartitions.Value', +# Any under replicated partitions are bad. +# Over 10 means (probably) that at least an entire topic +# is under replicated. +warning => '1', +critical=> '10', +require => Class['::kafka::server::jmxtrans'], +} + +# Alert if any Kafka Broker replica lag is too high +monitor_ganglia { 'kafka-broker-Replica-MaxLag': +description => 'Kafka Broker Replica Lag', +metric => 'kafka.server.ReplicaFetcherManager.Replica-MaxLag.Value', +# As of 2014-02 replag could catch up at more than 1000 msgs / sec, +# (probably more like 2 or 3 K / second). At that rate, 1M messages +# behind should catch back up in at least 30 minutes. +warning => '100', +critical=> '500', +require => Class['::kafka::server::jmxtrans'], +} +} -- To view, visit https://gerrit.wikimedia.org/r/112065 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I3116f74b7fcf6a66f865f4c5acba98593a513ffe Gerrit-PatchSet: 1 Gerrit-Project: operations/puppet Gerrit-Branch: production Gerrit-Owner: Ottomata ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
[MediaWiki-commits] [Gerrit] Adding alerts for Kafka Broker replication metrics - change (operations/puppet)
Ottomata has submitted this change and it was merged. Change subject: Adding alerts for Kafka Broker replication metrics .. Adding alerts for Kafka Broker replication metrics Change-Id: I3116f74b7fcf6a66f865f4c5acba98593a513ffe --- M manifests/role/analytics/kafka.pp 1 file changed, 26 insertions(+), 1 deletion(-) Approvals: Ottomata: Verified; Looks good to me, approved jenkins-bot: Verified diff --git a/manifests/role/analytics/kafka.pp b/manifests/role/analytics/kafka.pp index 6108416..b7dcbeb 100644 --- a/manifests/role/analytics/kafka.pp +++ b/manifests/role/analytics/kafka.pp @@ -171,5 +171,30 @@ critical=> ':1000.0', require => Class['::kafka::server::jmxtrans'], } -} +# Alert if any Kafka has under replicated partitions. +# If it does, this means a broker replica is falling behind +# and will be removed from the ISR. +monitor_ganglia { 'kafka-broker-UnderReplicatedPartitions': +description => 'Kafka Broker Under Replicated Partitions', +metric => 'kafka.server.ReplicaManager.UnderReplicatedPartitions.Value', +# Any under replicated partitions are bad. +# Over 10 means (probably) that at least an entire topic +# is under replicated. +warning => '1', +critical=> '10', +require => Class['::kafka::server::jmxtrans'], +} + +# Alert if any Kafka Broker replica lag is too high +monitor_ganglia { 'kafka-broker-Replica-MaxLag': +description => 'Kafka Broker Replica Lag', +metric => 'kafka.server.ReplicaFetcherManager.Replica-MaxLag.Value', +# As of 2014-02 replag could catch up at more than 1000 msgs / sec, +# (probably more like 2 or 3 K / second). At that rate, 1M messages +# behind should catch back up in at least 30 minutes. +warning => '100', +critical=> '500', +require => Class['::kafka::server::jmxtrans'], +} +} -- To view, visit https://gerrit.wikimedia.org/r/112065 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: merged Gerrit-Change-Id: I3116f74b7fcf6a66f865f4c5acba98593a513ffe Gerrit-PatchSet: 1 Gerrit-Project: operations/puppet Gerrit-Branch: production Gerrit-Owner: Ottomata Gerrit-Reviewer: Ottomata Gerrit-Reviewer: jenkins-bot <> ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
[MediaWiki-commits] [Gerrit] File outputs were erroneously closed in outputs_check - change (analytics/kafkatee)
Ottomata has submitted this change and it was merged. Change subject: File outputs were erroneously closed in outputs_check .. File outputs were erroneously closed in outputs_check Change-Id: I8383ad57ad66314acf8658288c481d249ed9d834 --- M output.c 1 file changed, 5 insertions(+), 4 deletions(-) Approvals: Ottomata: Verified; Looks good to me, approved diff --git a/output.c b/output.c index 8e9def7..4875599 100644 --- a/output.c +++ b/output.c @@ -356,10 +356,11 @@ /* Already open */ if (o->o_fd != -1) { - if (o->o_pipe.status != -1) - output_close(o, LOG_WARNING, -exec_exitstatus(o->o_pipe.status)); - continue; +if (o->o_type == OUTPUT_PIPE && +o->o_pipe.status != -1) +output_close(o, LOG_WARNING, + exec_exitstatus(o->o_pipe.status)); +continue; } /* Last open failed: back off */ -- To view, visit https://gerrit.wikimedia.org/r/112467 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: merged Gerrit-Change-Id: I8383ad57ad66314acf8658288c481d249ed9d834 Gerrit-PatchSet: 1 Gerrit-Project: analytics/kafkatee Gerrit-Branch: master Gerrit-Owner: Edenhill Gerrit-Reviewer: Ottomata ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
[MediaWiki-commits] [Gerrit] Adding new research leila, giving access to stat1 and stat10... - change (operations/puppet)
Ottomata has uploaded a new change for review. https://gerrit.wikimedia.org/r/112497 Change subject: Adding new research leila, giving access to stat1 and stat1002 and bastions .. Adding new research leila, giving access to stat1 and stat1002 and bastions RT 6765 Change-Id: I60633e881d7b497a12ebaf95ffc348f27ce0ff87 --- M manifests/admins.pp M manifests/site.pp 2 files changed, 25 insertions(+), 2 deletions(-) git pull ssh://gerrit.wikimedia.org:29418/operations/puppet refs/changes/97/112497/1 diff --git a/manifests/admins.pp b/manifests/admins.pp index 54154ee..3b4469d 100644 --- a/manifests/admins.pp +++ b/manifests/admins.pp @@ -3382,6 +3382,26 @@ } } +# RT 6765 +class leila inherits baseaccount { +$username = 'leila' +$realname = 'Leila Zia' +$uid = 3963 + +unixaccount { $realname: username => $username, uid => $uid, gid => $gid } + +if $manage_home { +Ssh_authorized_key { require => Unixaccount[$realname] } + +ssh_authorized_key { 'leila@starfruit': +ensure => 'present', +user => $username, +type => 'ssh-rsa', +key=> 'B3NzaC1yc2EDAQABAAABAQCyDhTiTa+lUt+lM++HXAYchRyKX4GVMwb4zAAovcHbG9R7NHAP1vT7px+vwFG69TZay/MsuZ7oo5NyRUWNF00CXSSx0KMZz5FirW/dncrRG9/N+fxat8jyjVVrFiY1sngSUhmILQrLGV0Wa7EC8ZHv0qywO4UqbfgGxZMY5n2nu3hFvLn6LoKKoNDjaFTfEwio8QNjdMC0NZLYqUk1HMj5Zm4mrTFD+UcOXSbbOe4MytQKDYzZdEYd4XOE1ki/dRvAmPhAj0gAkezPCRseCCamaDmokd+PS8db3EHJ390+48FTkXLIO1uUhJJmF9MsWL2dj2gDk1RZjkOlfcAapypl', +} +} +} + # FIXME: not an admin. This is more like a system account. class l10nupdate inherits baseaccount { $username = "l10nupdate" @@ -3532,6 +3552,7 @@ include accounts::ironholds # RT 5935 include accounts::nuria # RT 6535 include accounts::csalvia # RT 6664 + include accounts::leila # RT 6765 } class admins::labs { @@ -3590,7 +3611,8 @@ accounts::qchris, # RT 5474 accounts::tnegrin, # RT 5391 accounts::nuria,# RT 6617 - accounts::csalvia # RT 6664 + accounts::csalvia, # RT 6664 + accounts::leila # RT 6765 } class admins::fr-tech { diff --git a/manifests/site.pp b/manifests/site.pp index 4aef642..21fa38f 100644 --- a/manifests/site.pp +++ b/manifests/site.pp @@ -2278,7 +2278,8 @@ accounts::mholmquist,# RT 6009 accounts::msyed, # RT 6506 accounts::nuria, # RT 6525 -accounts::csalvia# RT 6664 +accounts::csalvia, # RT 6664 +accounts::leila # RT 6765 sudo_user { "otto": privileges => ['ALL = NOPASSWD: ALL'] } -- To view, visit https://gerrit.wikimedia.org/r/112497 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I60633e881d7b497a12ebaf95ffc348f27ce0ff87 Gerrit-PatchSet: 1 Gerrit-Project: operations/puppet Gerrit-Branch: production Gerrit-Owner: Ottomata ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
[MediaWiki-commits] [Gerrit] Adding new research leila, giving access to stat1 and stat10... - change (operations/puppet)
Ottomata has submitted this change and it was merged. Change subject: Adding new research leila, giving access to stat1 and stat1002 and bastions .. Adding new research leila, giving access to stat1 and stat1002 and bastions RT 6765 Change-Id: I60633e881d7b497a12ebaf95ffc348f27ce0ff87 --- M manifests/admins.pp M manifests/site.pp 2 files changed, 25 insertions(+), 2 deletions(-) Approvals: Ottomata: Verified; Looks good to me, approved jenkins-bot: Verified diff --git a/manifests/admins.pp b/manifests/admins.pp index 54154ee..3b4469d 100644 --- a/manifests/admins.pp +++ b/manifests/admins.pp @@ -3382,6 +3382,26 @@ } } +# RT 6765 +class leila inherits baseaccount { +$username = 'leila' +$realname = 'Leila Zia' +$uid = 3963 + +unixaccount { $realname: username => $username, uid => $uid, gid => $gid } + +if $manage_home { +Ssh_authorized_key { require => Unixaccount[$realname] } + +ssh_authorized_key { 'leila@starfruit': +ensure => 'present', +user => $username, +type => 'ssh-rsa', +key=> 'B3NzaC1yc2EDAQABAAABAQCyDhTiTa+lUt+lM++HXAYchRyKX4GVMwb4zAAovcHbG9R7NHAP1vT7px+vwFG69TZay/MsuZ7oo5NyRUWNF00CXSSx0KMZz5FirW/dncrRG9/N+fxat8jyjVVrFiY1sngSUhmILQrLGV0Wa7EC8ZHv0qywO4UqbfgGxZMY5n2nu3hFvLn6LoKKoNDjaFTfEwio8QNjdMC0NZLYqUk1HMj5Zm4mrTFD+UcOXSbbOe4MytQKDYzZdEYd4XOE1ki/dRvAmPhAj0gAkezPCRseCCamaDmokd+PS8db3EHJ390+48FTkXLIO1uUhJJmF9MsWL2dj2gDk1RZjkOlfcAapypl', +} +} +} + # FIXME: not an admin. This is more like a system account. class l10nupdate inherits baseaccount { $username = "l10nupdate" @@ -3532,6 +3552,7 @@ include accounts::ironholds # RT 5935 include accounts::nuria # RT 6535 include accounts::csalvia # RT 6664 + include accounts::leila # RT 6765 } class admins::labs { @@ -3590,7 +3611,8 @@ accounts::qchris, # RT 5474 accounts::tnegrin, # RT 5391 accounts::nuria,# RT 6617 - accounts::csalvia # RT 6664 + accounts::csalvia, # RT 6664 + accounts::leila # RT 6765 } class admins::fr-tech { diff --git a/manifests/site.pp b/manifests/site.pp index 4aef642..21fa38f 100644 --- a/manifests/site.pp +++ b/manifests/site.pp @@ -2278,7 +2278,8 @@ accounts::mholmquist,# RT 6009 accounts::msyed, # RT 6506 accounts::nuria, # RT 6525 -accounts::csalvia# RT 6664 +accounts::csalvia, # RT 6664 +accounts::leila # RT 6765 sudo_user { "otto": privileges => ['ALL = NOPASSWD: ALL'] } -- To view, visit https://gerrit.wikimedia.org/r/112497 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: merged Gerrit-Change-Id: I60633e881d7b497a12ebaf95ffc348f27ce0ff87 Gerrit-PatchSet: 1 Gerrit-Project: operations/puppet Gerrit-Branch: production Gerrit-Owner: Ottomata Gerrit-Reviewer: Ottomata Gerrit-Reviewer: jenkins-bot <> ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
[MediaWiki-commits] [Gerrit] locke: decom, remove udp2log filters - change (operations/puppet)
Ottomata has submitted this change and it was merged. Change subject: locke: decom, remove udp2log filters .. locke: decom, remove udp2log filters RT #6168 Change-Id: I3b6e0dbe6950b43d8dff7c5fa97fa6124658fb31 --- D templates/udp2log/filters.locke.erb 1 file changed, 0 insertions(+), 150 deletions(-) Approvals: Ottomata: Verified; Looks good to me, approved jenkins-bot: Verified diff --git a/templates/udp2log/filters.locke.erb b/templates/udp2log/filters.locke.erb deleted file mode 100644 index 233adef..000 --- a/templates/udp2log/filters.locke.erb +++ /dev/null @@ -1,150 +0,0 @@ -### -This file managed by puppet. -### - -# udp2log packet loss monitoring -pipe 10 /usr/bin/packet-loss 10 '\t' >> /a/squid/packet-loss.log - -## Fundraising -# Landing pages -#pipe 1 /usr/bin/udp-filter -F '\t' -d wikimediafoundation.org,donate.wikimedia.org >> /a/squid/fundraising/logs/landingpages.tab.log - -# Banner Impressions -#pipe 100 /usr/bin/udp-filter -F '\t' -p Special:RecordImpression\?banner=,Special:RecordImpression\?result= >> /a/squid/fundraising/logs/bannerImpressions-sampled100.tab.log - - - - - - - - - - - MOVED TO GADOLINIUM: - -# 0.001 of all udp2log messages -# This log file is also on emery for redundancy -#file 1000 /a/squid/sampled-1000.tab.log - -# All edits -#pipe 1 /usr/bin/udp-filter -F '\t' -p action=submit,action=edit >> /a/squid/edits.tab.log - - -# domas' stuff. -# (This looks like a bunch of C to filter for mobile pages -# and output things by language.) -#pipe 1 /a/webstats/bin/filter | log2udp -h 127.0.0.1 -p 3815 - -# Mobile traffic filter -#pipe 100 /usr/bin/udp-filter -F '\t' -d "m.wikipedia.org" >> /a/squid/mobile.tab.log - -# All 5xx error responses -- domas -# TODO: /usr/bin/udp-filter -F '\t' -r -s '^5' | awk -W interactive '$9 !~ "upload.wikimedia.org|query.php"' >> /a/squid/5xx.log -#pipe 1 /a/squid/5xx-filter | awk -W interactive '$9 !~ "upload.wikimedia.org|query.php"' >> /a/squid/5xx.tab.log - - - - -# Vrije Universiteit -# Contact: <%= scope.lookupvar('contacts::udp2log::vrije_universiteit_contact') %> -# pipe 10 awk -f /a/squid/vu.awk | log2udp -h 130.37.198.252 -p - - -# University of Minnesota -# Contact: <%= scope.lookupvar('contacts::udp2log::university_minnesota_contact') %> -# Former Contact: <%= scope.lookupvar('contacts::udp2log::university_minnesota_contact_former') %> -# Former contact: <%= scope.lookupvar('contacts::udp2log::university_minnesota_contact_former2') %> -# pipe 10 awk -f /a/squid/minnesota.awk | log2udp -h bento.cs.umn.edu -p - - - - - - - -# DISABLED FILTERS - - -# Monitor packet loss -- Tim -# Remove mobile log entries because they have broken sequence numbers -#pipe 10 grep -v '^mobile' | /usr/local/bin/packet-loss 10 >> /a/squid/packet-loss.log - - -#for book extension (data moved to hume then pdf1 by file_mover daily) -# disabled by ben 2011-10-25 due to load -#pipe 1 /a/squid/book-filter >> /a/squid/book-log.log - -#for mobile traffic -#pipe 1 /a/squid/wap-filter >> /a/squid/mobile.log - -# for debugging -#pipe 1 /a/webstats/bin/filter > /dev/null - -#for testing -#file 1000 /home/file_mover/test/bannerImpressions.log - -# Account creation/signup stats --nimish -# DISABLED -nimish- 11/19 -#pipe 1 /a/squid/acctcreation/ac-filter >> /a/squid/acctcreation.log - -# Universidad Rey Juan Carlos -# Contact: <%= scope.lookupvar('contacts::udp2log::universidad_rey_juan_carlos_contact') %> -# Backup contact: <%= scope.lookupvar('contacts::udp2log::universidad_rey_juan_carlos_backup_contact') %> -# Disabled by ottomata on April 30, 2012. 
-# This filter is flapping and not stable. Not sure why. -# Disabling and emailing <%= scope.lookupvar('contacts::udp2log::universidad_rey_juan_carlos_contact') %>. -#pipe 100 awk -f /a/squid/urjc.awk | log2udp -h wikilogs.libresoft.es -p 10514 - - -# Investigating watchlistr.com -- TS -#pipe 1 awk '$5 == "208.94.116.204" {print $0}' > /a/squid/watchlistr.log - -#pipe 1 awk '$9 ~ "/w/api\.php.*action=login" {print $0}' >> /a/squid/api.log - -# Investigate who's using up 1 Gbps of bandwidth all the time -# DISABLED -nimish- 11/19 -#pipe 1 awk '$7 > 1000 {print $0}' | geoiplogtag 5 >> /a/squid/large-requests.log - - -# All requetsts for [[Special:Book]] -#pipe 1 awk '$9 ~ "/wiki/Special:Book" { print $0 }' >> /a/squid/special-book.log - -# Logging Support requests -- fred -#pipe 1 awk -f /a/squid/support.awk >> /a/squid/support-requests.log - - - -# Find redirects and purge them (TEMP) -# pipe 1 awk '$6 ~ "/301$" && ( $9 ~ "/wiki/." || $9 ~ "/w/index\.php\?" ) { print $9 }' | tee /a/squid/self-redirects.log | php /root/purgeListStandalone.php - -# Remote loader investigation (TEMP) -#pipe 1 awk '$5 == "84.45.45.135" {print $0}' >> /a/squid/wikigalore.log - -# (TEMP) Vector migration -# pipe 1 awk '$9 ~ "^http://(en|es|ru|pl|pt|de|nl|fr|ja|it|commons)\.wiki[pm]edia\.org" { print $9 }' | python
[MediaWiki-commits] [Gerrit] locke: decom, remove logging doc comments - change (operations/puppet)
Ottomata has submitted this change and it was merged. Change subject: locke: decom, remove logging doc comments .. locke: decom, remove logging doc comments RT #6168 Change-Id: I5c9b75ecc14582bebb040a86087d7577e7b89801 --- M manifests/role/logging.pp 1 file changed, 1 insertion(+), 5 deletions(-) Approvals: Ottomata: Verified; Looks good to me, approved diff --git a/manifests/role/logging.pp b/manifests/role/logging.pp index ef75791..46f8309 100644 --- a/manifests/role/logging.pp +++ b/manifests/role/logging.pp @@ -1,6 +1,6 @@ # logging (udp2log) servers -# base node definition from which logging nodes (emery, locke, oxygen, etc) +# base node definition from which logging nodes (emery, oxygen, etc) # inherit. Note that there is no real node named "base_analytics_logging_node". # This is done as a base node primarily so that we can override the # $nagios_contact_group variable. @@ -221,8 +221,6 @@ } # Gzip pagecounts files hourly. -# This originally lived as an unpuppetized -# cron on locke that ran /a/webstats/scripts/tar. cron { 'webstats-dumps-gzip': command => "/bin/gzip ${webstats_dumps_directory}/pagecounts--?? 2> /dev/null", minute => 2, @@ -231,8 +229,6 @@ } # Delete webstats dumps that are older than 10 days daily. -# This originally lived as an unpuppetized -# cron on locke that ran /a/webstats/scripts/purge. cron { 'webstats-dumps-delete': command => "/usr/bin/find ${webstats_dumps_directory} -maxdepth 1 -type f -mtime +10 -delete", minute => 28, -- To view, visit https://gerrit.wikimedia.org/r/112593 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: merged Gerrit-Change-Id: I5c9b75ecc14582bebb040a86087d7577e7b89801 Gerrit-PatchSet: 2 Gerrit-Project: operations/puppet Gerrit-Branch: production Gerrit-Owner: Matanya Gerrit-Reviewer: Ottomata Gerrit-Reviewer: jenkins-bot <> ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
[MediaWiki-commits] [Gerrit] Continue on single repo failures for deployment server init - change (operations/puppet)
Ottomata has submitted this change and it was merged. Change subject: Continue on single repo failures for deployment server init .. Continue on single repo failures for deployment server init Rather than failing immediately if a single repo is broken (for instance if the upstream is listed incorrectly), continue on to other non-broken repos. Change-Id: I27add880f7f4b864f7357a298d42cd5c8bd7e8cd --- M modules/deployment/files/modules/deploy.py 1 file changed, 7 insertions(+), 4 deletions(-) Approvals: Ottomata: Verified; Looks good to me, approved diff --git a/modules/deployment/files/modules/deploy.py b/modules/deployment/files/modules/deploy.py index c024c07..6ea62cb 100644 --- a/modules/deployment/files/modules/deploy.py +++ b/modules/deployment/files/modules/deploy.py @@ -90,11 +90,12 @@ def deployment_server_init(): +ret_status = 0 serv = _get_redis_serv() is_deployment_server = __grains__.get('deployment_server') hook_dir = __grains__.get('deployment_global_hook_dir') if not is_deployment_server: -return 0 +return ret_status deploy_user = __grains__.get('deployment_repo_user') repo_config = __pillar__.get('repo_config') for repo in repo_config: @@ -120,7 +121,8 @@ status = __salt__['cmd.retcode'](cmd, runas=deploy_user, umask=002) if status != 0: -return status +ret_status = 1 +continue # git clone does ignores umask and does explicit mkdir with 755 __salt__['file.set_mode'](config['location'], 2775) # Set the repo name in the repo's config @@ -128,8 +130,9 @@ status = __salt__['cmd.retcode'](cmd, cwd=config['location'], runas=deploy_user, umask=002) if status != 0: -return status -return 0 +ret_status = 1 +continue +return ret_status def sync_all(): -- To view, visit https://gerrit.wikimedia.org/r/112317 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: merged Gerrit-Change-Id: I27add880f7f4b864f7357a298d42cd5c8bd7e8cd Gerrit-PatchSet: 2 Gerrit-Project: operations/puppet Gerrit-Branch: production Gerrit-Owner: Ryan Lane Gerrit-Reviewer: BryanDavis Gerrit-Reviewer: Ori.livneh Gerrit-Reviewer: Ottomata Gerrit-Reviewer: jenkins-bot <> ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
[MediaWiki-commits] [Gerrit] Ensure submodules are checked out on the deployment server - change (operations/puppet)
Ottomata has submitted this change and it was merged. Change subject: Ensure submodules are checked out on the deployment server .. Ensure submodules are checked out on the deployment server If checkout_submodules is enabled on minions at any point and the deployment server doesn't have the submodules checked out then the minions will enter an unrecoverable state after the initial deploy. This change always does a submodule recursive init for repos to ensure this condition can't happen. Change-Id: I125e9275d813dc34baed10878c19be94e4a0251e --- M modules/deployment/files/modules/deploy.py 1 file changed, 17 insertions(+), 2 deletions(-) Approvals: Ottomata: Verified; Looks good to me, approved diff --git a/modules/deployment/files/modules/deploy.py b/modules/deployment/files/modules/deploy.py index 6ea62cb..21ad4d1 100644 --- a/modules/deployment/files/modules/deploy.py +++ b/modules/deployment/files/modules/deploy.py @@ -116,10 +116,25 @@ if config['upstream']: cmd = '/usr/bin/git clone %s/.git %s' % (config['upstream'], config['location']) +status = __salt__['cmd.retcode'](cmd, runas=deploy_user, + umask=002) +if status != 0: +ret_status = 1 +continue +# We don't check the checkout_submodules config flag here +# on purpose. The deployment server should always have a +# fully recursive clone and minions should decide whether +# or not they'll use the submodules. This avoids consistency +# issues in the case where submodules are later enabled, but +# someone forgets to check them out. +cmd = '/usr/bin/git submodule update --init --recursive' +status = __salt__['cmd.retcode'](cmd, runas=deploy_user, + umask=002, + cwd=config['location']) else: cmd = '/usr/bin/git init %s' % (config['location']) -status = __salt__['cmd.retcode'](cmd, runas=deploy_user, - umask=002) +status = __salt__['cmd.retcode'](cmd, runas=deploy_user, + umask=002) if status != 0: ret_status = 1 continue -- To view, visit https://gerrit.wikimedia.org/r/112319 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: merged Gerrit-Change-Id: I125e9275d813dc34baed10878c19be94e4a0251e Gerrit-PatchSet: 3 Gerrit-Project: operations/puppet Gerrit-Branch: production Gerrit-Owner: Ryan Lane Gerrit-Reviewer: BryanDavis Gerrit-Reviewer: Ori.livneh Gerrit-Reviewer: Ottomata Gerrit-Reviewer: Ryan Lane Gerrit-Reviewer: jenkins-bot <> ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
[MediaWiki-commits] [Gerrit] Add Sphinx function documentation for deploy module - change (operations/puppet)
Ottomata has submitted this change and it was merged. Change subject: Add Sphinx function documentation for deploy module .. Add Sphinx function documentation for deploy module This change adds basic Sphinx documentation to the deploy module. Change-Id: Ia8f04c8bc95ef521192e1ea9fa1a176b4bd58b8a --- M modules/deployment/files/modules/deploy.py 1 file changed, 152 insertions(+), 9 deletions(-) Approvals: Ottomata: Verified; Looks good to me, approved diff --git a/modules/deployment/files/modules/deploy.py b/modules/deployment/files/modules/deploy.py index 21ad4d1..d3ab312 100644 --- a/modules/deployment/files/modules/deploy.py +++ b/modules/deployment/files/modules/deploy.py @@ -12,6 +12,8 @@ def _get_redis_serv(): ''' Return a redis server object + +:rtype: A Redis object ''' deployment_config = __pillar__.get('deployment_config') deploy_redis = deployment_config['redis'] @@ -22,6 +24,16 @@ def _check_in(function, repo): +""" +Private function used for reporting that a function has started. +Writes to redis with basic status information. + +:param function: The function being reported on. +:type function: str +:param repo: The repository being acted on. +:type repo: str +:rtype: None +""" serv = _get_redis_serv() minion = __grains__.get('id') timestamp = time.time() @@ -39,6 +51,16 @@ def _map_args(repo, args): +""" +Maps a set of arguments to a predefined set of values. Currently only +__REPO__ is support and will be replaced with the repository name. + +:param repo: The repo name used for mapping. +:type repo: str +:param args: An array of arguments to map. +:type args: list +:rtype: list +""" arg_map = {'__REPO__': repo} mapped_args = [] for arg in args: @@ -47,6 +69,14 @@ def get_config(repo): +""" +Fetches the configuration for this repo from the pillars and returns +a hash with the munged configuration (with defaults and helper config). + +:param repo: The specific repo for which to return config data. +:type repo: str +:rtype: hash +""" deployment_config = __pillar__.get('deployment_config') config = __pillar__.get('repo_config') config = config[repo] @@ -90,6 +120,14 @@ def deployment_server_init(): +""" +Initializes a set of repositories on the deployment server. This +function will only run on the deployment server and will initialize +any repository defined in the pillar configuration. This function is +safe to call at any point. + +:rtype: int +""" ret_status = 0 serv = _get_redis_serv() is_deployment_server = __grains__.get('deployment_server') @@ -152,11 +190,20 @@ def sync_all(): ''' -Sync all repositories. If a repo doesn't exist on target, clone as well. +Sync all repositories for this minion. If a repo doesn't exist on target, +clone it as well. This function will ensure all repositories for the +minion are at the current tag as defined by the master and is +be safe to call at any point. -CLI Example:: +CLI Example (from the master): -salt -G 'cluster:appservers' deploy.sync_all +salt -G 'deployment_target:test' deploy.sync_all + +CLI Example (from a minion): + +salt-call deploy.sync_all + +:rtype: hash ''' repo_config = __pillar__.get('repo_config') deployment_target = __grains__.get('deployment_target') @@ -175,6 +222,23 @@ def _update_gitmodules(config, location, shadow=False): +""" +Finds all .gitmodules in a repository, changes all submodules within them +to point to the correct submodule on the deployment server, then runs +a submodule sync. This function is in support of recursive submodules. 
+ +In the case we need to update a shadow reference repo, the .gitmodules +files will have their submodules point to the reference clone. + +:param config: The config hash for the repo (as pulled from get_config). +:type config: hash +:param location: The location on the filesystem to find the .gitmodules + files. +:type location: str +:param shadow: Defines whether or not this is a shadow reference repo. +:type shadow: bool +:rtype: int +""" gitmodules_list = __salt__['file.find'](location, name='.gitmodules') for gitmodules in gitmodules_list: gitmodules_dir = os.path.dirname(gitmodules) @@ -221,6 +285,21 @@ def _clone(config, location, tag, shadow=False): +""" +Perform a clone of a repo at a specified location, and +do a fetch and checkout of the repo to ensure it's at the +current deployment tag. + +:param config: Config hash as fetched from get_config +:type config: hash +:param location: The location on the filesystem to clone this repo. +:type location: st
[MediaWiki-commits] [Gerrit] [DO NOT MERGE] Setting batch_num_messages to 6000 - change (operations/puppet)
Ottomata has submitted this change and it was merged. Change subject: [DO NOT MERGE] Setting batch_num_messages to 6000 .. [DO NOT MERGE] Setting batch_num_messages to 6000 Don't merge this yet. Magnus and are are trying to catch cp3019 misbehaving again, and we think that this setting might fix it. Change-Id: I2243fc06f410a5bdf8a8ce489e3a3b1fe16ecb2c --- M manifests/role/cache.pp M modules/varnishkafka 2 files changed, 4 insertions(+), 0 deletions(-) Approvals: Ottomata: Verified; Looks good to me, approved diff --git a/manifests/role/cache.pp b/manifests/role/cache.pp index 70bd938..b2535bd 100644 --- a/manifests/role/cache.pp +++ b/manifests/role/cache.pp @@ -429,6 +429,10 @@ format => "%{fake_tag0@hostname?${::fqdn}}x %{@sequence!num?0}n %{%FT%T@dt}t %{Varnish:time_firstbyte@time_firstbyte!num?0.0}x %{@ip}h %{Varnish:handling@cache_status}x %{@http_status}s %{@response_size!num?0}b %{@http_method}m %{Host@uri_host}i %{@uri_path}U %{@uri_query}q %{Content-Type@content_type}o %{Referer@referer}i %{X-Forwarded-For@x_forwarded_for}i %{User-Agent@user_agent}i %{Accept-Language@accept_language}i %{X-Analytics@x_analytics}o", message_send_max_retries => 3, queue_buffering_max_messages => 200, +# bits varnishes do about 6000 reqs / sec each. +# We want to buffer for about max 1 second. +batch_num_messages => 6000, + # large timeout to account for potential cross DC latencies topic_request_timeout_ms => 3, # request ack timeout # Write out stats to varnishkafka.stats.json diff --git a/modules/varnishkafka b/modules/varnishkafka index bab034a..04ece4f 16 --- a/modules/varnishkafka +++ b/modules/varnishkafka -Subproject commit bab034a007f665303bc3c3fad48e34eccbb3648c +Subproject commit 04ece4fb9de20929eaaad0c7a26100d424db22e4 -- To view, visit https://gerrit.wikimedia.org/r/111523 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: merged Gerrit-Change-Id: I2243fc06f410a5bdf8a8ce489e3a3b1fe16ecb2c Gerrit-PatchSet: 2 Gerrit-Project: operations/puppet Gerrit-Branch: production Gerrit-Owner: Ottomata Gerrit-Reviewer: Ottomata Gerrit-Reviewer: jenkins-bot <> ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
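The batch size in this change follows from the rate-times-time estimate spelled out in the inline comment; as a quick sanity check (the numbers below are the approximations quoted in that comment, not measurements):

# Rough sizing for varnishkafka batching on the bits caches:
# ~6000 requests/second per varnish, buffered for at most ~1 second.
requests_per_second = 6000
max_buffer_seconds = 1

batch_num_messages = requests_per_second * max_buffer_seconds
print(batch_num_messages)  # 6000, the value set in role/cache.pp above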
[MediaWiki-commits] [Gerrit] Putting analytics users on analytics1027 so they can ssh tun... - change (operations/puppet)
Ottomata has submitted this change and it was merged. Change subject: Putting analytics users on analytics1027 so they can ssh tunnel to hue .. Putting analytics users on analytics1027 so they can ssh tunnel to hue Change-Id: Ib59eae1617e9b102bf7739e5788595fffda55330 --- M manifests/site.pp 1 file changed, 2 insertions(+), 0 deletions(-) Approvals: Ottomata: Verified; Looks good to me, approved diff --git a/manifests/site.pp b/manifests/site.pp index 30093dc..644c8a3 100644 --- a/manifests/site.pp +++ b/manifests/site.pp @@ -171,6 +171,8 @@ # interfaces to Kraken and Hadoop. # (Hue, Oozie, Hive, etc.) node "analytics1027.eqiad.wmnet" { +include role::analytics::users + include role::analytics::clients include role::analytics::hive::server include role::analytics::oozie::server -- To view, visit https://gerrit.wikimedia.org/r/112937 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: merged Gerrit-Change-Id: Ib59eae1617e9b102bf7739e5788595fffda55330 Gerrit-PatchSet: 1 Gerrit-Project: operations/puppet Gerrit-Branch: production Gerrit-Owner: Ottomata Gerrit-Reviewer: Ottomata ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
[MediaWiki-commits] [Gerrit] Putting analytics users on analytics1027 so they can ssh tun... - change (operations/puppet)
Ottomata has uploaded a new change for review. https://gerrit.wikimedia.org/r/112937 Change subject: Putting analytics users on analytics1027 so they can ssh tunnel to hue .. Putting analytics users on analytics1027 so they can ssh tunnel to hue Change-Id: Ib59eae1617e9b102bf7739e5788595fffda55330 --- M manifests/site.pp 1 file changed, 2 insertions(+), 0 deletions(-) git pull ssh://gerrit.wikimedia.org:29418/operations/puppet refs/changes/37/112937/1 diff --git a/manifests/site.pp b/manifests/site.pp index 30093dc..644c8a3 100644 --- a/manifests/site.pp +++ b/manifests/site.pp @@ -171,6 +171,8 @@ # interfaces to Kraken and Hadoop. # (Hue, Oozie, Hive, etc.) node "analytics1027.eqiad.wmnet" { +include role::analytics::users + include role::analytics::clients include role::analytics::hive::server include role::analytics::oozie::server -- To view, visit https://gerrit.wikimedia.org/r/112937 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: newchange Gerrit-Change-Id: Ib59eae1617e9b102bf7739e5788595fffda55330 Gerrit-PatchSet: 1 Gerrit-Project: operations/puppet Gerrit-Branch: production Gerrit-Owner: Ottomata ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
[MediaWiki-commits] [Gerrit] Adding git-fat support for trebuchet deployment - change (operations/puppet)
Ottomata has uploaded a new change for review. https://gerrit.wikimedia.org/r/112944 Change subject: Adding git-fat support for trebuchet deployment .. Adding git-fat support for trebuchet deployment Change-Id: Ib5338575a0240825a87baed9adaa1baafa287754 --- M modules/deployment/files/modules/deploy.py 1 file changed, 52 insertions(+), 0 deletions(-) git pull ssh://gerrit.wikimedia.org:29418/operations/puppet refs/changes/44/112944/1 diff --git a/modules/deployment/files/modules/deploy.py b/modules/deployment/files/modules/deploy.py index d3ab312..b0dd0a6 100644 --- a/modules/deployment/files/modules/deploy.py +++ b/modules/deployment/files/modules/deploy.py @@ -109,6 +109,7 @@ scheme = 'http' config['url'] = '{0}://{1}/{2}'.format(scheme, server, repo) config.setdefault('checkout_submodules', False) +config.setdefault('gitfat_enabled', False) config.setdefault('dependencies', {}) config.setdefault('checkout_module_calls', {}) config.setdefault('fetch_module_calls', {}) @@ -282,6 +283,50 @@ if status != 0: return status return 0 + + +def _gitfat_installed(): +cmd = 'which git-fat' +return __salt__['cmd.retcode'](cmd) + +def _init_gitfat(location): +''' +Runs git fat init at this location. + +:param location: The location on the filesystem to run git fat init +:type location: str +''' +# if it isn't then initialize it now +cmd = '/usr/bin/git fat init' +return __salt__['cmd.retcode'](cmd, location) + + +# TODO: git fat gc? +def _update_gitfat(location): +''' +Runs git-fat pull at this location. +If git fat has not been initialized for the +repository at this location, _init_gitfat +will be called first. + +:param location: The location on the filesystem to run git fat pull +:type location: str +''' + +# Make sure git fat is installed. +if _gitfat_installed() != 0: +return 40 # TODO: I just made this retval up, what should this be? + +# Make sure git fat is initialized. +cmd = '/usr/bin/git config --get filter.fat.smudge' +if __salt__['cmd.run'](cmd, location) != 'git-fat filter-smudge': +status = _init_gitfat(location) +if status != 0: +return status + +# Run git fat pull. +cmd = '/usr/bin/git fat pull' +return __salt__['cmd.retcode'](cmd, location) def _clone(config, location, tag, shadow=False): @@ -552,6 +597,13 @@ ret = __salt__['cmd.retcode'](cmd, location) if ret != 0: return 50 + +# Trigger git-fat pull if gitfat_enabled +if config['gitfat_enabled']: +ret = _update_gitfat(location) +if ret != 0: +return ret + return 0 -- To view, visit https://gerrit.wikimedia.org/r/112944 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: newchange Gerrit-Change-Id: Ib5338575a0240825a87baed9adaa1baafa287754 Gerrit-PatchSet: 1 Gerrit-Project: operations/puppet Gerrit-Branch: production Gerrit-Owner: Ottomata ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
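Outside of Salt's module system (no __salt__ or cmd.retcode), the init-then-pull flow added above can be sketched with plain subprocess calls. This is only an illustration of the same logic, assuming the git-fat commands shown in the diff are on the PATH; it is not a drop-in replacement for deploy.py.

import subprocess

def update_gitfat(location):
    """Run 'git fat pull' in `location`, initializing git-fat there first if needed."""
    # Bail out early if git-fat is not installed at all.
    if subprocess.call(['which', 'git-fat']) != 0:
        return 40  # mirrors the placeholder return value used in the change above

    # 'git fat init' installs the filter.fat.smudge config entry, so its
    # absence means the repository has not been initialized for git-fat yet.
    smudge = subprocess.run(
        ['git', 'config', '--get', 'filter.fat.smudge'],
        cwd=location, capture_output=True, text=True,
    ).stdout.strip()
    if smudge != 'git-fat filter-smudge':
        status = subprocess.call(['git', 'fat', 'init'], cwd=location)
        if status != 0:
            return status

    # Finally, fetch the fat objects referenced by the current checkout.
    return subprocess.call(['git', 'fat', 'pull'], cwd=location)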
[MediaWiki-commits] [Gerrit] Output correct Hue tunnel address on console - change (analytics/kraken)
Ottomata has submitted this change and it was merged. Change subject: Output correct Hue tunnel address on console .. Output correct Hue tunnel address on console Change-Id: I3920274a992e9dbe16b594518bc162c2c5d1fff3 --- M bin/ktunnel 1 file changed, 12 insertions(+), 11 deletions(-) Approvals: Ottomata: Verified; Looks good to me, approved diff --git a/bin/ktunnel b/bin/ktunnel index d33a400..a897bdb 100755 --- a/bin/ktunnel +++ b/bin/ktunnel @@ -41,15 +41,15 @@ default_ssh_address = 'bast1001.wikimedia.org' services = { -'namenode': {'host': 'analytics1010.eqiad.wmnet', 'port': 50070}, -'datanode': {'host': 'analytics1011.eqiad.wmnet', 'port': 50075}, -'jobs': {'host': 'analytics1010.eqiad.wmnet', 'port': 8088 }, -'history': {'host': 'analytics1010.eqiad.wmnet', 'port': 19888}, -'hue': {'host': 'analytics1027.eqiad.wmnet', 'port': }, -'oozie':{'host': 'analytics1027.eqiad.wmnet', 'port': 11000}, -'logs': {'host': 'analytics1020.eqiad.wmnet', 'port': 8042}, -'j':{'host': 'analytics1010.eqiad.wmnet', 'port': 50030}, -'proxy':{'host': 'analytics1010.eqiad.wmnet', 'port': 8999}, +'namenode': {'host': 'analytics1010.eqiad.wmnet', 'port': 50070, 'https': False}, +'datanode': {'host': 'analytics1011.eqiad.wmnet', 'port': 50075, 'https': False}, +'jobs': {'host': 'analytics1010.eqiad.wmnet', 'port': 8088, 'https': False }, +'history': {'host': 'analytics1010.eqiad.wmnet', 'port': 19888, 'https': False}, +'hue': {'host': 'analytics1027.eqiad.wmnet', 'port': , 'https': True }, +'oozie':{'host': 'analytics1027.eqiad.wmnet', 'port': 11000, 'https': False}, +'logs': {'host': 'analytics1020.eqiad.wmnet', 'port': 8042, 'https': False}, +'j':{'host': 'analytics1010.eqiad.wmnet', 'port': 50030, 'https': False}, +'proxy':{'host': 'analytics1010.eqiad.wmnet', 'port': 8999, 'https': False}, } def tunnel(ssh_host, bind_port, dest_host, dest_port, background=False, verbose=True): @@ -82,6 +82,7 @@ print('Starting SOCKS proxy through %s:%i') proxy(service['host'], service['port'], arguments['--background'], arguments['--verbose']) else: -print("Tunneling to %s. Tunnel will be available at http://localhost:%s\n"; % (arguments[''], service['port'])) +protocol = 'https' if service['https'] == True else 'http' +print("Tunneling to %s. Tunnel will be available at %s://localhost:%s\n" % (arguments[''], protocol, service['port'])) tunnel(ssh_host, service['port'], service['host'], service['port'], arguments['--background'], arguments['--verbose']) - \ No newline at end of file + -- To view, visit https://gerrit.wikimedia.org/r/113013 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: merged Gerrit-Change-Id: I3920274a992e9dbe16b594518bc162c2c5d1fff3 Gerrit-PatchSet: 1 Gerrit-Project: analytics/kraken Gerrit-Branch: master Gerrit-Owner: Diederik Gerrit-Reviewer: Ottomata ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
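For context, the tunnel() helper used above ultimately opens an SSH local port forward. A minimal stand-alone equivalent, with hostnames and ports taken from the services table purely for illustration (this is not the actual ktunnel implementation), could look like:

import subprocess

def open_tunnel(ssh_host, bind_port, dest_host, dest_port, background=False):
    """Forward localhost:bind_port to dest_host:dest_port through ssh_host."""
    cmd = ['ssh', '-L', '{0}:{1}:{2}'.format(bind_port, dest_host, dest_port), ssh_host]
    if background:
        # -f backgrounds ssh after authentication, -N skips running a remote command.
        cmd[1:1] = ['-f', '-N']
    return subprocess.call(cmd)

# e.g. open_tunnel('bast1001.wikimedia.org', 11000, 'analytics1027.eqiad.wmnet', 11000)
# after which the Oozie web UI would be reachable at http://localhost:11000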
[MediaWiki-commits] [Gerrit] Initial debian release - change (operations...git-fat)
Ottomata has uploaded a new change for review. https://gerrit.wikimedia.org/r/113018 Change subject: Initial debian release .. Initial debian release Change-Id: Icecaffdcedf39e2aa3c2d8e9b0adc0cd36d1bcdd --- A debian/changelog A debian/compat A debian/control A debian/copyright A debian/gbp.conf A debian/install A debian/rules 7 files changed, 91 insertions(+), 0 deletions(-) git pull ssh://gerrit.wikimedia.org:29418/operations/debs/git-fat refs/changes/18/113018/1 diff --git a/debian/changelog b/debian/changelog new file mode 100644 index 000..818d864 --- /dev/null +++ b/debian/changelog @@ -0,0 +1,5 @@ +git-fat (0.1.0-1) precise-wikimedia; urgency=low + + * Initial package + + -- Andrew Otto (WMF) Wed, 12 Feb 2014 21:14:02 + diff --git a/debian/compat b/debian/compat new file mode 100644 index 000..ec63514 --- /dev/null +++ b/debian/compat @@ -0,0 +1 @@ +9 diff --git a/debian/control b/debian/control new file mode 100644 index 000..b143e35 --- /dev/null +++ b/debian/control @@ -0,0 +1,32 @@ +Source: git-fat +Maintainer: Andrew Otto (WMF) +Section: net +Priority: extra +Build-Depends: python, debhelper (>= 9) +Standards-Version: 3.9.5 +Vcs-Git: https://github.com/jedbrown/git-fat.git +Vcs-Browser: https://github.com/jedbrown/git-fat + +Package: git-fat +Architecture: all +Depends: ${misc:Depends}, ${python:Depends}, git-core +X-Python-Version: >= 2.6 +Provides: ${python:Provides} +Description: Manage large files with git, without checking the files into git. + git-fat maintains metadata about locations of files without checking the files + into git. It relies on a configurable centralized rsync location to sync files + between checkouts of git repositories. + . + Features: + . + - clones of the source repository are small and fast because no + binaries are transferred, yet fully functional (unlike git clone --depth) + - git-fat supports the same workflow for large binaries and traditionally + versioned files, but internally manages the "fat" files separately + - git-bisect works properly even when versions of the binary files change + over time + - selective control of which large files to pull into the local store + - local fat object stores can be shared between multiple clones, even by + different users + - can easily support fat object stores distributed across multiple hosts + - depends only on stock Python and rsync diff --git a/debian/copyright b/debian/copyright new file mode 100644 index 000..18e3220 --- /dev/null +++ b/debian/copyright @@ -0,0 +1,34 @@ +Format: http://www.debian.org/doc/packaging-manuals/copyright-format/1.0/ +Upstream-Name: git-fat +Source: https://github.com/jedbrown/git-fat + +Files: * +Copyright: 2012 Jed Brown +License: BSD-2-clause + +Files: debian/* +Copyright: 2014 Andrew Otto (WMF) + 2014 Wikimedia Foundation, Inc. +License: BSD-2-clause + +License: BSD-2-clause + Redistribution and use in source and binary forms, with or without + modification, are permitted provided that the following conditions are met: + . + 1. Redistributions of source code must retain the above copyright notice, + this list of conditions and the following disclaimer. + 2. Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + . 
+ THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" + AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE + LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR + CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF + SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS + INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN + CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE + POSSIBILITY OF SUCH DAMAGE. diff --git a/debian/gbp.conf b/debian/gbp.conf new file mode 100644 index 000..83c3d4a --- /dev/null +++ b/debian/gbp.conf @@ -0,0 +1,6 @@ +[git-buildpackage] +upstream-tree=branch +upstream-branch=master +debian-branch=debian +tarball-dir = ../tarballs +export-dir = ../build-area diff --git a/debian/install b/debian/install new file mode 100644 index 000..a20e522 --- /dev/null +++ b/debian/install @@ -0,0 +1 @@ +git-fat /usr/lib/git-core diff --git a/debian/rules b/debian/rules new file mode 100755 index 000..48a82ea --- /dev/null +++ b/debian/rules @@ -0,0 +1,12 @@ +#!/usr/bin/make -f + +# Uncomment this to turn on verbose mode. +#export DH_VERBOSE=1 + +%: + dh $@ --wi
[MediaWiki-commits] [Gerrit] Enable ResourceManager API in hue.ini to enable JobHistory i... - change (operations...cdh4)
Ottomata has submitted this change and it was merged. Change subject: Enable ResourceManager API in hue.ini to enable JobHistory in Hue. .. Enable ResourceManager API in hue.ini to enable JobHistory in Hue. Change-Id: I3f7d244ec319d5d287c579a69237300a9379d835 --- M templates/hue/hue.ini.erb 1 file changed, 1 insertion(+), 1 deletion(-) Approvals: Ottomata: Verified; Looks good to me, approved jenkins-bot: Verified diff --git a/templates/hue/hue.ini.erb b/templates/hue/hue.ini.erb index 2762ca7..971753f 100644 --- a/templates/hue/hue.ini.erb +++ b/templates/hue/hue.ini.erb @@ -300,7 +300,7 @@ ## hadoop_conf_dir=/etc/hadoop/conf # URL of the ResourceManager API - ## resourcemanager_api_url=http://<%= @namenode_host %>:8088 + resourcemanager_api_url=http://<%= @namenode_host %>:8088 # URL of the ProxyServer API proxy_api_url=http://<%= @namenode_host %>:8088 -- To view, visit https://gerrit.wikimedia.org/r/113021 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: merged Gerrit-Change-Id: I3f7d244ec319d5d287c579a69237300a9379d835 Gerrit-PatchSet: 1 Gerrit-Project: operations/puppet/cdh4 Gerrit-Branch: master Gerrit-Owner: Diederik Gerrit-Reviewer: Ottomata Gerrit-Reviewer: jenkins-bot <> ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
[MediaWiki-commits] [Gerrit] Updating cdh4 submodule with hue jobbrowser fix - change (operations/puppet)
Ottomata has submitted this change and it was merged. Change subject: Updating cdh4 submodule with hue jobbrowser fix .. Updating cdh4 submodule with hue jobbrowser fix Change-Id: I2966d05d873f79852d2da748588600e1347e518a --- M modules/cdh4 1 file changed, 0 insertions(+), 0 deletions(-) Approvals: Ottomata: Verified; Looks good to me, approved jenkins-bot: Verified diff --git a/modules/cdh4 b/modules/cdh4 index 7095c8b..d7a2589 16 --- a/modules/cdh4 +++ b/modules/cdh4 -Subproject commit 7095c8b51e16f14f9b88166da8a5f90a5f887063 +Subproject commit d7a258934473f9753cb275c8840f3bbcb8e9a7dc -- To view, visit https://gerrit.wikimedia.org/r/113030 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: merged Gerrit-Change-Id: I2966d05d873f79852d2da748588600e1347e518a Gerrit-PatchSet: 1 Gerrit-Project: operations/puppet Gerrit-Branch: production Gerrit-Owner: Diederik Gerrit-Reviewer: Ottomata Gerrit-Reviewer: jenkins-bot <> ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
[MediaWiki-commits] [Gerrit] Fixing public-datasets rsync job on stat1001 - change (operations/puppet)
Ottomata has submitted this change and it was merged. Change subject: Fixing public-datasets rsync job on stat1001 .. Fixing public-datasets rsync job on stat1001 Change-Id: If749f3b1eb16fe0b05665e1183a341df676b4034 --- M manifests/misc/statistics.pp 1 file changed, 2 insertions(+), 1 deletion(-) Approvals: Ottomata: Verified; Looks good to me, approved diff --git a/manifests/misc/statistics.pp b/manifests/misc/statistics.pp index 206793c..cfe11fa 100644 --- a/manifests/misc/statistics.pp +++ b/manifests/misc/statistics.pp @@ -194,8 +194,9 @@ mode => '0640', } +# rsync from stat1:/a/public-datasets to /var/www/public-datasets/ cron { 'rsync public datasets': - command => '/usr/bin/rsync -rt stat1.wikimedia.org::public-datasets/* /var/www/public-datasets/', + command => '/usr/bin/rsync -rt stat1.wikimedia.org::a/public-datasets/* /var/www/public-datasets/', require => File['/var/www/public-datasets'], user=> 'root', hour=> '*', -- To view, visit https://gerrit.wikimedia.org/r/67355 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: merged Gerrit-Change-Id: If749f3b1eb16fe0b05665e1183a341df676b4034 Gerrit-PatchSet: 1 Gerrit-Project: operations/puppet Gerrit-Branch: production Gerrit-Owner: Ottomata Gerrit-Reviewer: Ottomata ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
[MediaWiki-commits] [Gerrit] Fixing public-datasets rsync job on stat1001 - change (operations/puppet)
Ottomata has uploaded a new change for review. https://gerrit.wikimedia.org/r/67355 Change subject: Fixing public-datasets rsync job on stat1001 .. Fixing public-datasets rsync job on stat1001 Change-Id: If749f3b1eb16fe0b05665e1183a341df676b4034 --- M manifests/misc/statistics.pp 1 file changed, 2 insertions(+), 1 deletion(-) git pull ssh://gerrit.wikimedia.org:29418/operations/puppet refs/changes/55/67355/1 diff --git a/manifests/misc/statistics.pp b/manifests/misc/statistics.pp index 206793c..cfe11fa 100644 --- a/manifests/misc/statistics.pp +++ b/manifests/misc/statistics.pp @@ -194,8 +194,9 @@ mode => '0640', } +# rsync from stat1:/a/public-datasets to /var/www/public-datasets/ cron { 'rsync public datasets': - command => '/usr/bin/rsync -rt stat1.wikimedia.org::public-datasets/* /var/www/public-datasets/', + command => '/usr/bin/rsync -rt stat1.wikimedia.org::a/public-datasets/* /var/www/public-datasets/', require => File['/var/www/public-datasets'], user=> 'root', hour=> '*', -- To view, visit https://gerrit.wikimedia.org/r/67355 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: newchange Gerrit-Change-Id: If749f3b1eb16fe0b05665e1183a341df676b4034 Gerrit-PatchSet: 1 Gerrit-Project: operations/puppet Gerrit-Branch: production Gerrit-Owner: Ottomata ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
[MediaWiki-commits] [Gerrit] Adding support for ganglia metrics in hadoop-metrics2.proper... - change (operations...cdh4)
Ottomata has uploaded a new change for review. https://gerrit.wikimedia.org/r/67419 Change subject: Adding support for ganglia metrics in hadoop-metrics2.properties .. Adding support for ganglia metrics in hadoop-metrics2.properties Change-Id: I547a9c0faaeb2d937c506f57a271230c4fe64c06 --- M TODO.md M manifests/hadoop.pp M manifests/hadoop/defaults.pp A templates/hadoop/hadoop-metrics2.properties.erb 4 files changed, 62 insertions(+), 2 deletions(-) git pull ssh://gerrit.wikimedia.org:29418/operations/puppet/cdh4 refs/changes/19/67419/1 diff --git a/TODO.md b/TODO.md index 9dac694..04e2dd2 100644 --- a/TODO.md +++ b/TODO.md @@ -2,7 +2,6 @@ ## Hadoop -- Add hadoop-metrics2.properties configuration - Add hosts.exclude support for decommissioning nodes. - Change cluster (conf) name? (use update-alternatives?) - Set default map/reduce tasks automatically based on facter node stats. @@ -12,6 +11,7 @@ - Support Secondary NameNode. - Support High Availability NameNode. - Make JMX ports configurable. +- Make hadoop-metrics2.properties more configurable. ## Hive - Hive Server + Hive Metastore @@ -24,3 +24,6 @@ ## Zookeeper +Won't implement. A Zookeeper package is available upstream in Debian/Ubuntu. +Puppetization for this package can be found at +https://github.com/wikimedia/operations-puppet-zookeeper diff --git a/manifests/hadoop.pp b/manifests/hadoop.pp index 49562a0..e974daf 100644 --- a/manifests/hadoop.pp +++ b/manifests/hadoop.pp @@ -37,6 +37,7 @@ # $yarn_nodemanager_resource_memory_mb # $yarn_resourcemanager_scheduler_class - If you change this (e.g. to FairScheduler), you should also provide your own scheduler config .xml files outside of the cdh4 module. # $use_yarn +# $ganglia_hosts- Set this to an array of ganglia host:ports if you want to enable ganglia sinks in hadoop-metrics2.properites # class cdh4::hadoop( $namenode_hostname, @@ -64,7 +65,8 @@ $mapreduce_final_compession = $::cdh4::hadoop::defaults::mapreduce_final_compession, $yarn_nodemanager_resource_memory_mb = $::cdh4::hadoop::defaults::yarn_nodemanager_resource_memory_mb, $yarn_resourcemanager_scheduler_class= $::cdh4::hadoop::defaults::yarn_resourcemanager_scheduler_class, - $use_yarn= $::cdh4::hadoop::defaults::use_yarn + $use_yarn= $::cdh4::hadoop::defaults::use_yarn, + $ganglia_hosts = $::cdh4::hadoop::defaults::ganglia_hosts, ) inherits cdh4::hadoop::defaults { @@ -129,4 +131,15 @@ ensure => $yarn_ensure, content => template('cdh4/hadoop/yarn-env.sh.erb'), } + + # render hadoop-metrics2.properties + # if we hav Ganglia Hosts to send metrics to. + $hadoop_metrics2_ensure = $ganglia_hosts ? { + undef => 'absent', + default => 'present', + } + file { "${config_directory}/hadoop-metrics2.properties": + ensure => $hadoop_metrics2_ensure, + content => template('cdh4/hadoop/hadoop-metrics2.properties.erb'), + } } diff --git a/manifests/hadoop/defaults.pp b/manifests/hadoop/defaults.pp index 3fc6c5d..373ca8e 100644 --- a/manifests/hadoop/defaults.pp +++ b/manifests/hadoop/defaults.pp @@ -26,4 +26,5 @@ $yarn_nodemanager_resource_memory_mb = undef $yarn_resourcemanager_scheduler_class= undef $use_yarn= true + $ganglia_hosts = undef } \ No newline at end of file diff --git a/templates/hadoop/hadoop-metrics2.properties.erb b/templates/hadoop/hadoop-metrics2.properties.erb new file mode 100644 index 000..9075e62 --- /dev/null +++ b/templates/hadoop/hadoop-metrics2.properties.erb @@ -0,0 +1,43 @@ +# NOTE: This file is managed by Puppet. 
+ +# syntax: [prefix].[source|sink].[instance].[options] +# See javadoc of package-info.java for org.apache.hadoop.metrics2 for details + +# default sampling period, in seconds +*.period=10 + +<% if @ganglia_hosts +ganglia_hosts_string = ganglia_hosts.sort.join(',') +-%> +# +# Below are for sending metrics to Ganglia +# + +# for Ganglia 3.1 support +*.sink.ganglia.class=org.apache.hadoop.metrics2.sink.ganglia.GangliaSink31 + +*.sink.ganglia.period=10 + +# default for supportsparse is false +# *.sink.ganglia.supportsparse=true + +*.sink.ganglia.slope=jvm.metrics.gcCount=zero,jvm.metrics.memHeapUsedM=both +*.sink.ganglia.dmax=jvm.metrics.threadsBlocked=70,jvm.metrics.memHeapUsedM=40 + +namenode.sink.ganglia.servers=<%= ganglia_hosts_string %> +datanode.sink.ganglia.servers=<%= ganglia_hosts_string %> + +<% if use_yarn -%> +resourcemanager.sink.ganglia.servers=<%= ganglia_hosts_string %> +nodemanager.sink.ganglia.servers=<%= ganglia_hosts_string %> +<% else -%> +jobtracker.sink.ganglia.servers=<%= ganglia_hosts_string %> +tasktracker.sink.ganglia.servers=<%= ganglia_hosts_string %> +<% end -%> + +maptask.sink.ganglia.servers=
[MediaWiki-commits] [Gerrit] Adding support for ganglia metrics in hadoop-metrics2.proper... - change (operations...cdh4)
Ottomata has submitted this change and it was merged. Change subject: Adding support for ganglia metrics in hadoop-metrics2.properties .. Adding support for ganglia metrics in hadoop-metrics2.properties Change-Id: I547a9c0faaeb2d937c506f57a271230c4fe64c06 --- M TODO.md M manifests/hadoop.pp M manifests/hadoop/defaults.pp A templates/hadoop/hadoop-metrics2.properties.erb 4 files changed, 62 insertions(+), 2 deletions(-) Approvals: Ottomata: Verified; Looks good to me, approved jenkins-bot: Verified diff --git a/TODO.md b/TODO.md index 9dac694..04e2dd2 100644 --- a/TODO.md +++ b/TODO.md @@ -2,7 +2,6 @@ ## Hadoop -- Add hadoop-metrics2.properties configuration - Add hosts.exclude support for decommissioning nodes. - Change cluster (conf) name? (use update-alternatives?) - Set default map/reduce tasks automatically based on facter node stats. @@ -12,6 +11,7 @@ - Support Secondary NameNode. - Support High Availability NameNode. - Make JMX ports configurable. +- Make hadoop-metrics2.properties more configurable. ## Hive - Hive Server + Hive Metastore @@ -24,3 +24,6 @@ ## Zookeeper +Won't implement. A Zookeeper package is available upstream in Debian/Ubuntu. +Puppetization for this package can be found at +https://github.com/wikimedia/operations-puppet-zookeeper diff --git a/manifests/hadoop.pp b/manifests/hadoop.pp index 49562a0..e974daf 100644 --- a/manifests/hadoop.pp +++ b/manifests/hadoop.pp @@ -37,6 +37,7 @@ # $yarn_nodemanager_resource_memory_mb # $yarn_resourcemanager_scheduler_class - If you change this (e.g. to FairScheduler), you should also provide your own scheduler config .xml files outside of the cdh4 module. # $use_yarn +# $ganglia_hosts- Set this to an array of ganglia host:ports if you want to enable ganglia sinks in hadoop-metrics2.properites # class cdh4::hadoop( $namenode_hostname, @@ -64,7 +65,8 @@ $mapreduce_final_compession = $::cdh4::hadoop::defaults::mapreduce_final_compession, $yarn_nodemanager_resource_memory_mb = $::cdh4::hadoop::defaults::yarn_nodemanager_resource_memory_mb, $yarn_resourcemanager_scheduler_class= $::cdh4::hadoop::defaults::yarn_resourcemanager_scheduler_class, - $use_yarn= $::cdh4::hadoop::defaults::use_yarn + $use_yarn= $::cdh4::hadoop::defaults::use_yarn, + $ganglia_hosts = $::cdh4::hadoop::defaults::ganglia_hosts, ) inherits cdh4::hadoop::defaults { @@ -129,4 +131,15 @@ ensure => $yarn_ensure, content => template('cdh4/hadoop/yarn-env.sh.erb'), } + + # render hadoop-metrics2.properties + # if we hav Ganglia Hosts to send metrics to. + $hadoop_metrics2_ensure = $ganglia_hosts ? { + undef => 'absent', + default => 'present', + } + file { "${config_directory}/hadoop-metrics2.properties": + ensure => $hadoop_metrics2_ensure, + content => template('cdh4/hadoop/hadoop-metrics2.properties.erb'), + } } diff --git a/manifests/hadoop/defaults.pp b/manifests/hadoop/defaults.pp index 3fc6c5d..373ca8e 100644 --- a/manifests/hadoop/defaults.pp +++ b/manifests/hadoop/defaults.pp @@ -26,4 +26,5 @@ $yarn_nodemanager_resource_memory_mb = undef $yarn_resourcemanager_scheduler_class= undef $use_yarn= true + $ganglia_hosts = undef } \ No newline at end of file diff --git a/templates/hadoop/hadoop-metrics2.properties.erb b/templates/hadoop/hadoop-metrics2.properties.erb new file mode 100644 index 000..9075e62 --- /dev/null +++ b/templates/hadoop/hadoop-metrics2.properties.erb @@ -0,0 +1,43 @@ +# NOTE: This file is managed by Puppet. 
+ +# syntax: [prefix].[source|sink].[instance].[options] +# See javadoc of package-info.java for org.apache.hadoop.metrics2 for details + +# default sampling period, in seconds +*.period=10 + +<% if @ganglia_hosts +ganglia_hosts_string = ganglia_hosts.sort.join(',') +-%> +# +# Below are for sending metrics to Ganglia +# + +# for Ganglia 3.1 support +*.sink.ganglia.class=org.apache.hadoop.metrics2.sink.ganglia.GangliaSink31 + +*.sink.ganglia.period=10 + +# default for supportsparse is false +# *.sink.ganglia.supportsparse=true + +*.sink.ganglia.slope=jvm.metrics.gcCount=zero,jvm.metrics.memHeapUsedM=both +*.sink.ganglia.dmax=jvm.metrics.threadsBlocked=70,jvm.metrics.memHeapUsedM=40 + +namenode.sink.ganglia.servers=<%= ganglia_hosts_string %> +datanode.sink.ganglia.servers=<%= ganglia_hosts_string %> + +<% if use_yarn -%> +resourcemanager.sink.ganglia.servers=<%= ganglia_hosts_string %> +nodemanager.sink.ganglia.servers=<%= ganglia_hosts_string %> +<% else -%> +jobtracker.sink.ganglia.servers=<%= ganglia_hosts_string %> +tasktracker.sink.ganglia.servers=<%= ganglia_hosts_string %> +<% end -%> + +maptask.sink.ganglia.servers=<%= ganglia_hosts_string %> +reducetask.
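The ERB template in this change reduces to joining the sorted host:port list with commas and emitting one sink line per Hadoop daemon. A rough Python rendering of that logic (daemon names taken from the template; everything else is illustrative) is:

def ganglia_sink_lines(ganglia_hosts, use_yarn=True):
    """Build the *.sink.ganglia.servers lines for hadoop-metrics2.properties."""
    hosts = ','.join(sorted(ganglia_hosts))  # the template sorts, then joins with ','
    daemons = ['namenode', 'datanode']
    daemons += ['resourcemanager', 'nodemanager'] if use_yarn else ['jobtracker', 'tasktracker']
    daemons += ['maptask', 'reducetask']
    return '\n'.join('{0}.sink.ganglia.servers={1}'.format(d, hosts) for d in daemons)

print(ganglia_sink_lines(['239.192.1.32:8649']))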
[MediaWiki-commits] [Gerrit] Updating CDH4 module, configuring hadoop-metrics2.properties... - change (operations/puppet)
Ottomata has uploaded a new change for review. https://gerrit.wikimedia.org/r/67424 Change subject: Updating CDH4 module, configuring hadoop-metrics2.properties with ganglia aggreagtor hosts .. Updating CDH4 module, configuring hadoop-metrics2.properties with ganglia aggreagtor hosts Change-Id: I837485d86472969b59e60c239d03e791e07b287b --- M manifests/role/analytics/hadoop.pp M modules/cdh4 2 files changed, 4 insertions(+), 0 deletions(-) git pull ssh://gerrit.wikimedia.org:29418/operations/puppet refs/changes/24/67424/1 diff --git a/manifests/role/analytics/hadoop.pp b/manifests/role/analytics/hadoop.pp index 2ebbb79..fd3757a 100644 --- a/manifests/role/analytics/hadoop.pp +++ b/manifests/role/analytics/hadoop.pp @@ -87,6 +87,8 @@ mapreduce_task_io_sort_factor => 10, yarn_nodemanager_resource_memory_mb => 40960, yarn_resourcemanager_scheduler_class=> 'org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler', +# TODO: use variables from new ganglia module once it is finished. +ganglia_hosts => ['239.192.1.32:8649'], } file { "$::cdh4::hadoop::config_directory/fair-scheduler.xml": @@ -130,6 +132,8 @@ mapreduce_reduce_tasks_maximum => 2, mapreduce_job_reuse_jvm_num_tasks => 1, yarn_resourcemanager_scheduler_class=> 'org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler', +# TODO: use variables from new ganglia module once it is finished. +ganglia_hosts => ['10.4.0.79:8649'], } file { "$::cdh4::hadoop::config_directory/fair-scheduler.xml": diff --git a/modules/cdh4 b/modules/cdh4 index b5f7b70..02718e3 16 --- a/modules/cdh4 +++ b/modules/cdh4 -Subproject commit b5f7b70b2191528e9df2a81d2361e1a1fdf2469b +Subproject commit 02718e3af97b2b0cddf9f428d32e4904b0389576 -- To view, visit https://gerrit.wikimedia.org/r/67424 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I837485d86472969b59e60c239d03e791e07b287b Gerrit-PatchSet: 1 Gerrit-Project: operations/puppet Gerrit-Branch: production Gerrit-Owner: Ottomata ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
[MediaWiki-commits] [Gerrit] Updating CDH4 module, configuring hadoop-metrics2.properties... - change (operations/puppet)
Ottomata has submitted this change and it was merged. Change subject: Updating CDH4 module, configuring hadoop-metrics2.properties with ganglia aggreagtor hosts .. Updating CDH4 module, configuring hadoop-metrics2.properties with ganglia aggreagtor hosts Change-Id: I837485d86472969b59e60c239d03e791e07b287b --- M manifests/role/analytics/hadoop.pp M modules/cdh4 2 files changed, 4 insertions(+), 0 deletions(-) Approvals: Ottomata: Verified; Looks good to me, approved jenkins-bot: Verified diff --git a/manifests/role/analytics/hadoop.pp b/manifests/role/analytics/hadoop.pp index 2ebbb79..fd3757a 100644 --- a/manifests/role/analytics/hadoop.pp +++ b/manifests/role/analytics/hadoop.pp @@ -87,6 +87,8 @@ mapreduce_task_io_sort_factor => 10, yarn_nodemanager_resource_memory_mb => 40960, yarn_resourcemanager_scheduler_class=> 'org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler', +# TODO: use variables from new ganglia module once it is finished. +ganglia_hosts => ['239.192.1.32:8649'], } file { "$::cdh4::hadoop::config_directory/fair-scheduler.xml": @@ -130,6 +132,8 @@ mapreduce_reduce_tasks_maximum => 2, mapreduce_job_reuse_jvm_num_tasks => 1, yarn_resourcemanager_scheduler_class=> 'org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler', +# TODO: use variables from new ganglia module once it is finished. +ganglia_hosts => ['10.4.0.79:8649'], } file { "$::cdh4::hadoop::config_directory/fair-scheduler.xml": diff --git a/modules/cdh4 b/modules/cdh4 index b5f7b70..02718e3 16 --- a/modules/cdh4 +++ b/modules/cdh4 -Subproject commit b5f7b70b2191528e9df2a81d2361e1a1fdf2469b +Subproject commit 02718e3af97b2b0cddf9f428d32e4904b0389576 -- To view, visit https://gerrit.wikimedia.org/r/67424 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: merged Gerrit-Change-Id: I837485d86472969b59e60c239d03e791e07b287b Gerrit-PatchSet: 1 Gerrit-Project: operations/puppet Gerrit-Branch: production Gerrit-Owner: Ottomata Gerrit-Reviewer: Ottomata Gerrit-Reviewer: jenkins-bot ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
[MediaWiki-commits] [Gerrit] Adding --delete flag to rsync public datasets cron job - change (operations/puppet)
Ottomata has uploaded a new change for review. https://gerrit.wikimedia.org/r/67478 Change subject: Adding --delete flag to rsync public datasets cron job .. Adding --delete flag to rsync public datasets cron job Change-Id: Ie7e25cd7b778cf7b1f5bcb747614a47256e2a0b3 --- M manifests/misc/statistics.pp 1 file changed, 1 insertion(+), 1 deletion(-) git pull ssh://gerrit.wikimedia.org:29418/operations/puppet refs/changes/78/67478/1 diff --git a/manifests/misc/statistics.pp b/manifests/misc/statistics.pp index cfe11fa..abef878 100644 --- a/manifests/misc/statistics.pp +++ b/manifests/misc/statistics.pp @@ -196,7 +196,7 @@ # rsync from stat1:/a/public-datasets to /var/www/public-datasets/ cron { 'rsync public datasets': - command => '/usr/bin/rsync -rt stat1.wikimedia.org::a/public-datasets/* /var/www/public-datasets/', + command => '/usr/bin/rsync -rt --delete stat1.wikimedia.org::a/public-datasets/* /var/www/public-datasets/', require => File['/var/www/public-datasets'], user=> 'root', hour=> '*', -- To view, visit https://gerrit.wikimedia.org/r/67478 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: newchange Gerrit-Change-Id: Ie7e25cd7b778cf7b1f5bcb747614a47256e2a0b3 Gerrit-PatchSet: 1 Gerrit-Project: operations/puppet Gerrit-Branch: production Gerrit-Owner: Ottomata ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
[MediaWiki-commits] [Gerrit] Adding --delete flag to rsync public datasets cron job - change (operations/puppet)
Ottomata has submitted this change and it was merged. Change subject: Adding --delete flag to rsync public datasets cron job .. Adding --delete flag to rsync public datasets cron job Change-Id: Ie7e25cd7b778cf7b1f5bcb747614a47256e2a0b3 --- M manifests/misc/statistics.pp 1 file changed, 1 insertion(+), 1 deletion(-) Approvals: Ottomata: Verified; Looks good to me, approved diff --git a/manifests/misc/statistics.pp b/manifests/misc/statistics.pp index cfe11fa..abef878 100644 --- a/manifests/misc/statistics.pp +++ b/manifests/misc/statistics.pp @@ -196,7 +196,7 @@ # rsync from stat1:/a/public-datasets to /var/www/public-datasets/ cron { 'rsync public datasets': - command => '/usr/bin/rsync -rt stat1.wikimedia.org::a/public-datasets/* /var/www/public-datasets/', + command => '/usr/bin/rsync -rt --delete stat1.wikimedia.org::a/public-datasets/* /var/www/public-datasets/', require => File['/var/www/public-datasets'], user=> 'root', hour=> '*', -- To view, visit https://gerrit.wikimedia.org/r/67478 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: merged Gerrit-Change-Id: Ie7e25cd7b778cf7b1f5bcb747614a47256e2a0b3 Gerrit-PatchSet: 1 Gerrit-Project: operations/puppet Gerrit-Branch: production Gerrit-Owner: Ottomata Gerrit-Reviewer: Ottomata ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
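The only functional change here is the --delete flag: without it the web copy only ever accumulates files, while with it files removed from stat1's public-datasets directory are also removed from /var/www/public-datasets/, keeping the two directories in sync. As a sketch, the cron entry is equivalent to running the command string from the diff through a shell:

import subprocess

# Same invocation as the cron entry above; --delete removes files from the
# destination that no longer exist in the source public-datasets share.
subprocess.call(
    '/usr/bin/rsync -rt --delete '
    'stat1.wikimedia.org::a/public-datasets/* /var/www/public-datasets/',
    shell=True,
)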
[MediaWiki-commits] [Gerrit] Installing OpenJDK Java 7 instead of Sun/Oracle Java 6 on ne... - change (operations/puppet)
Ottomata has uploaded a new change for review. https://gerrit.wikimedia.org/r/67832 Change subject: Installing OpenJDK Java 7 instead of Sun/Oracle Java 6 on newly reinstalled analytics nodes. .. Installing OpenJDK Java 7 instead of Sun/Oracle Java 6 on newly reinstalled analytics nodes. Also tabs -> 4 spaces. Change-Id: Ie052ba254d47368c813f4957095c1d1ad5f772cc --- M manifests/role/analytics.pp 1 file changed, 91 insertions(+), 55 deletions(-) git pull ssh://gerrit.wikimedia.org:29418/operations/puppet refs/changes/32/67832/1 diff --git a/manifests/role/analytics.pp b/manifests/role/analytics.pp index 2b98019..7510c98 100644 --- a/manifests/role/analytics.pp +++ b/manifests/role/analytics.pp @@ -3,77 +3,113 @@ @monitor_group { "analytics-eqiad": description => "analytics servers in eqiad" } class role::analytics { - system_role { "role::analytics": description => "analytics server" } - $nagios_group = "analytics-eqiad" - # ganglia cluster name. - $cluster = "analytics" +system_role { "role::analytics": description => "analytics server" } +$nagios_group = "analytics-eqiad" +# ganglia cluster name. +$cluster = "analytics" - include standard - include admins::roots +include standard +include admins::roots - # Include stats system user to - # run automated jobs and for file - # ownership. - include misc::statistics::user +# Include stats system user to +# run automated jobs and for file +# ownership. +include misc::statistics::user - # include analytics user accounts - include role::analytics::users +# include analytics user accounts +include role::analytics::users - # Install Sun/Oracle Java JDK on analytics cluster - java { "java-6-oracle": - distribution => 'oracle', - version => 6, - } +# all analytics nodes need java installed +include role::analytics::java - # We want to be able to geolocate IP addresses - include geoip +# We want to be able to geolocate IP addresses +include geoip - # udp-filter is a useful thing! - include misc::udp2log::udp_filter +# udp-filter is a useful thing! +include misc::udp2log::udp_filter +} + +# Contains list of reinstalled analytics nodes. +# Once all analytics nodes are reinstalled, this +# class will be removed. +class role::analytics::reinstalled { +$nodes = [ +'analytics1001', +'analytics1020', +] +} + +class role::analytics::java { +# Most analytics nodes currently are running +# Sun/Oracle Java 6. As we reinstall these nodes, +# we want to switch over to Java 7. +# The following conditional will be removed once +# all nodes have been reinstalled. +include role::analytics::reinstalled + +if (member($role::analytics::reinstalled::nodes, $hostname)) { +java { "java-7-openjdk": +distribution => 'openjdk', +version => 7, +} +# Install Sun/Oracle Java JDK on analytics cluster +java { "java-6-oracle": +distribution => 'oracle', +version => 6, +ensure => 'absent' +} +} +else { +# Install Sun/Oracle Java JDK on analytics cluster +java { "java-6-oracle": +distribution => 'oracle', +version => 6, +} +} } class role::analytics::users { - # Analytics user accounts will be added to the - # 'stats' group which gets created by this class. - require misc::statistics::user +# Analytics user accounts will be added to the +# 'stats' group which gets created by this class. 
+require misc::statistics::user - include accounts::diederik, - accounts::dsc, - accounts::otto, - accounts::dartar, - accounts::erosen, - accounts::olivneh, - accounts::erik, - accounts::milimetric, - accounts::yurik, # RT 5158 - accounts::spetrea, # RT 4402 - accounts::ram # RT 5059 +include accounts::diederik, +accounts::dsc, +accounts::otto, +accounts::dartar, +accounts::erosen, +accounts::olivneh, +accounts::erik, +accounts::milimetric, +accounts::yurik, # RT 5158 +accounts::spetrea, # RT 4402 +accounts::ram # RT 5059 - # add Analytics team members to the stats group so they can - # access data group owned by 'stats'. - User<|title == milimetric|> { groups +> [ "stats" ] } - User<|title == yurik|> { groups +> [ "stats" ] } - User<|title == dartar|> { groups +> [ "stats" ] } - User<|title == dsc|> { groups +> [ "stats" ] } - User<|title =
[MediaWiki-commits] [Gerrit] Installing OpenJDK Java 7 instead of Sun/Oracle Java 6 on ne... - change (operations/puppet)
Ottomata has submitted this change and it was merged. Change subject: Installing OpenJDK Java 7 instead of Sun/Oracle Java 6 on newly reinstalled analytics nodes. .. Installing OpenJDK Java 7 instead of Sun/Oracle Java 6 on newly reinstalled analytics nodes. Also tabs -> 4 spaces. Change-Id: Ie052ba254d47368c813f4957095c1d1ad5f772cc --- M manifests/role/analytics.pp 1 file changed, 91 insertions(+), 55 deletions(-) Approvals: Ottomata: Verified; Looks good to me, approved jenkins-bot: Verified diff --git a/manifests/role/analytics.pp b/manifests/role/analytics.pp index 2b98019..7510c98 100644 --- a/manifests/role/analytics.pp +++ b/manifests/role/analytics.pp @@ -3,77 +3,113 @@ @monitor_group { "analytics-eqiad": description => "analytics servers in eqiad" } class role::analytics { - system_role { "role::analytics": description => "analytics server" } - $nagios_group = "analytics-eqiad" - # ganglia cluster name. - $cluster = "analytics" +system_role { "role::analytics": description => "analytics server" } +$nagios_group = "analytics-eqiad" +# ganglia cluster name. +$cluster = "analytics" - include standard - include admins::roots +include standard +include admins::roots - # Include stats system user to - # run automated jobs and for file - # ownership. - include misc::statistics::user +# Include stats system user to +# run automated jobs and for file +# ownership. +include misc::statistics::user - # include analytics user accounts - include role::analytics::users +# include analytics user accounts +include role::analytics::users - # Install Sun/Oracle Java JDK on analytics cluster - java { "java-6-oracle": - distribution => 'oracle', - version => 6, - } +# all analytics nodes need java installed +include role::analytics::java - # We want to be able to geolocate IP addresses - include geoip +# We want to be able to geolocate IP addresses +include geoip - # udp-filter is a useful thing! - include misc::udp2log::udp_filter +# udp-filter is a useful thing! +include misc::udp2log::udp_filter +} + +# Contains list of reinstalled analytics nodes. +# Once all analytics nodes are reinstalled, this +# class will be removed. +class role::analytics::reinstalled { +$nodes = [ +'analytics1001', +'analytics1020', +] +} + +class role::analytics::java { +# Most analytics nodes currently are running +# Sun/Oracle Java 6. As we reinstall these nodes, +# we want to switch over to Java 7. +# The following conditional will be removed once +# all nodes have been reinstalled. +include role::analytics::reinstalled + +if (member($role::analytics::reinstalled::nodes, $hostname)) { +java { "java-7-openjdk": +distribution => 'openjdk', +version => 7, +} +# Install Sun/Oracle Java JDK on analytics cluster +java { "java-6-oracle": +distribution => 'oracle', +version => 6, +ensure => 'absent' +} +} +else { +# Install Sun/Oracle Java JDK on analytics cluster +java { "java-6-oracle": +distribution => 'oracle', +version => 6, +} +} } class role::analytics::users { - # Analytics user accounts will be added to the - # 'stats' group which gets created by this class. - require misc::statistics::user +# Analytics user accounts will be added to the +# 'stats' group which gets created by this class. 
+require misc::statistics::user - include accounts::diederik, - accounts::dsc, - accounts::otto, - accounts::dartar, - accounts::erosen, - accounts::olivneh, - accounts::erik, - accounts::milimetric, - accounts::yurik, # RT 5158 - accounts::spetrea, # RT 4402 - accounts::ram # RT 5059 +include accounts::diederik, +accounts::dsc, +accounts::otto, +accounts::dartar, +accounts::erosen, +accounts::olivneh, +accounts::erik, +accounts::milimetric, +accounts::yurik, # RT 5158 +accounts::spetrea, # RT 4402 +accounts::ram # RT 5059 - # add Analytics team members to the stats group so they can - # access data group owned by 'stats'. - User<|title == milimetric|> { groups +> [ "stats" ] } - User<|title == yurik|> { groups +> [ "stats" ] } - User<|title == dartar|> { groups +> [ "stats" ] } - User<|title == dsc|> { groups +> [ "stats" ] } - User<|title == diederik|>{ groups +> [ "stat
[MediaWiki-commits] [Gerrit] Prepping analytics1020 for reinstall. - change (operations/puppet)
Ottomata has submitted this change and it was merged. Change subject: Prepping analytics1020 for reinstall. .. Prepping analytics1020 for reinstall. Change-Id: I146bf98401671ecc534aaa38fc08b990b1f44aaf --- M manifests/site.pp 1 file changed, 14 insertions(+), 0 deletions(-) Approvals: Ottomata: Verified; Looks good to me, approved diff --git a/manifests/site.pp b/manifests/site.pp index 34b828c..65e254e 100644 --- a/manifests/site.pp +++ b/manifests/site.pp @@ -127,6 +127,8 @@ interface_add_ip6_mapped { "main": } } + + # analytics1001.wikimedia.org is the analytics cluster master. node "analytics1001.wikimedia.org" { include role::analytics @@ -135,6 +137,18 @@ include misc::udp2log::iptables } +# analytics1020 is a Hadoop Worker +node "analytics1020.eqiad.wmnet" { + include role::analytics +} + + + + + + +### Analytics nodes below this line need to be reinstalled. + node "analytics1002.eqiad.wmnet" { include role::analytics } -- To view, visit https://gerrit.wikimedia.org/r/67834 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: merged Gerrit-Change-Id: I146bf98401671ecc534aaa38fc08b990b1f44aaf Gerrit-PatchSet: 1 Gerrit-Project: operations/puppet Gerrit-Branch: production Gerrit-Owner: Ottomata Gerrit-Reviewer: Ottomata ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
[MediaWiki-commits] [Gerrit] Prepping analytics1020 for reinstall. - change (operations/puppet)
Ottomata has uploaded a new change for review. https://gerrit.wikimedia.org/r/67834 Change subject: Prepping analytics1020 for reinstall. .. Prepping analytics1020 for reinstall. Change-Id: I146bf98401671ecc534aaa38fc08b990b1f44aaf --- M manifests/site.pp 1 file changed, 14 insertions(+), 0 deletions(-) git pull ssh://gerrit.wikimedia.org:29418/operations/puppet refs/changes/34/67834/1 diff --git a/manifests/site.pp b/manifests/site.pp index 34b828c..65e254e 100644 --- a/manifests/site.pp +++ b/manifests/site.pp @@ -127,6 +127,8 @@ interface_add_ip6_mapped { "main": } } + + # analytics1001.wikimedia.org is the analytics cluster master. node "analytics1001.wikimedia.org" { include role::analytics @@ -135,6 +137,18 @@ include misc::udp2log::iptables } +# analytics1020 is a Hadoop Worker +node "analytics1020.eqiad.wmnet" { + include role::analytics +} + + + + + + +### Analytics nodes below this line need to be reinstalled. + node "analytics1002.eqiad.wmnet" { include role::analytics } -- To view, visit https://gerrit.wikimedia.org/r/67834 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I146bf98401671ecc534aaa38fc08b990b1f44aaf Gerrit-PatchSet: 1 Gerrit-Project: operations/puppet Gerrit-Branch: production Gerrit-Owner: Ottomata ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
[MediaWiki-commits] [Gerrit] Reverting change to install OpenJDK 7 on analytics nodes. - change (operations/puppet)
Ottomata has submitted this change and it was merged. Change subject: Reverting change to install OpenJDK 7 on analytics nodes. .. Reverting change to install OpenJDK 7 on analytics nodes. See: http://wiki.apache.org/hadoop/HadoopJavaVersions http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH4/latest/CDH4-Release-Notes/cdh4rn_topic_2_2.html "Note*: OpenJDK6 has some open bugs w.r.t handling of generics (https://bugs.launchpad.net/ubuntu/+source/openjdk-6/+bug/611284, https://bugs.launchpad.net/ubuntu/+source/openjdk-6/+bug/716959), so OpenJDK cannot be used to compile hadoop mapreduce code in branch-0.23 and beyond, please use other JDKs." "MRv2 (YARN) is not supported on JDK 7 at present, because of MAPREDUCE-2264. This problem is expected to be fixed in an upcoming release." Change-Id: I36a3e7df237021723e2a04d669c099f319dddf69 --- M manifests/role/analytics.pp 1 file changed, 7 insertions(+), 32 deletions(-) Approvals: Ottomata: Verified; Looks good to me, approved jenkins-bot: Verified diff --git a/manifests/role/analytics.pp b/manifests/role/analytics.pp index 7510c98..f211f50 100644 --- a/manifests/role/analytics.pp +++ b/manifests/role/analytics.pp @@ -19,14 +19,18 @@ # include analytics user accounts include role::analytics::users -# all analytics nodes need java installed -include role::analytics::java - # We want to be able to geolocate IP addresses include geoip # udp-filter is a useful thing! include misc::udp2log::udp_filter + +# all analytics nodes need java installed +# Install Sun/Oracle Java JDK on analytics cluster +java { "java-6-oracle": +distribution => 'oracle', +version => 6, +} } # Contains list of reinstalled analytics nodes. @@ -37,35 +41,6 @@ 'analytics1001', 'analytics1020', ] -} - -class role::analytics::java { -# Most analytics nodes currently are running -# Sun/Oracle Java 6. As we reinstall these nodes, -# we want to switch over to Java 7. -# The following conditional will be removed once -# all nodes have been reinstalled. -include role::analytics::reinstalled - -if (member($role::analytics::reinstalled::nodes, $hostname)) { -java { "java-7-openjdk": -distribution => 'openjdk', -version => 7, -} -# Install Sun/Oracle Java JDK on analytics cluster -java { "java-6-oracle": -distribution => 'oracle', -version => 6, -ensure => 'absent' -} -} -else { -# Install Sun/Oracle Java JDK on analytics cluster -java { "java-6-oracle": -distribution => 'oracle', -version => 6, -} -} } class role::analytics::users { -- To view, visit https://gerrit.wikimedia.org/r/67837 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: merged Gerrit-Change-Id: I36a3e7df237021723e2a04d669c099f319dddf69 Gerrit-PatchSet: 1 Gerrit-Project: operations/puppet Gerrit-Branch: production Gerrit-Owner: Ottomata Gerrit-Reviewer: Ottomata Gerrit-Reviewer: jenkins-bot ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
[MediaWiki-commits] [Gerrit] Reverting change to install OpenJDK 7 on analytics nodes. - change (operations/puppet)
Ottomata has uploaded a new change for review. https://gerrit.wikimedia.org/r/67837 Change subject: Reverting change to install OpenJDK 7 on analytics nodes. .. Reverting change to install OpenJDK 7 on analytics nodes. See: http://wiki.apache.org/hadoop/HadoopJavaVersions http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH4/latest/CDH4-Release-Notes/cdh4rn_topic_2_2.html "Note*: OpenJDK6 has some open bugs w.r.t handling of generics (https://bugs.launchpad.net/ubuntu/+source/openjdk-6/+bug/611284, https://bugs.launchpad.net/ubuntu/+source/openjdk-6/+bug/716959), so OpenJDK cannot be used to compile hadoop mapreduce code in branch-0.23 and beyond, please use other JDKs." "MRv2 (YARN) is not supported on JDK 7 at present, because of MAPREDUCE-2264. This problem is expected to be fixed in an upcoming release." Change-Id: I36a3e7df237021723e2a04d669c099f319dddf69 --- M manifests/role/analytics.pp 1 file changed, 7 insertions(+), 32 deletions(-) git pull ssh://gerrit.wikimedia.org:29418/operations/puppet refs/changes/37/67837/1 diff --git a/manifests/role/analytics.pp b/manifests/role/analytics.pp index 7510c98..f211f50 100644 --- a/manifests/role/analytics.pp +++ b/manifests/role/analytics.pp @@ -19,14 +19,18 @@ # include analytics user accounts include role::analytics::users -# all analytics nodes need java installed -include role::analytics::java - # We want to be able to geolocate IP addresses include geoip # udp-filter is a useful thing! include misc::udp2log::udp_filter + +# all analytics nodes need java installed +# Install Sun/Oracle Java JDK on analytics cluster +java { "java-6-oracle": +distribution => 'oracle', +version => 6, +} } # Contains list of reinstalled analytics nodes. @@ -37,35 +41,6 @@ 'analytics1001', 'analytics1020', ] -} - -class role::analytics::java { -# Most analytics nodes currently are running -# Sun/Oracle Java 6. As we reinstall these nodes, -# we want to switch over to Java 7. -# The following conditional will be removed once -# all nodes have been reinstalled. -include role::analytics::reinstalled - -if (member($role::analytics::reinstalled::nodes, $hostname)) { -java { "java-7-openjdk": -distribution => 'openjdk', -version => 7, -} -# Install Sun/Oracle Java JDK on analytics cluster -java { "java-6-oracle": -distribution => 'oracle', -version => 6, -ensure => 'absent' -} -} -else { -# Install Sun/Oracle Java JDK on analytics cluster -java { "java-6-oracle": -distribution => 'oracle', -version => 6, -} -} } class role::analytics::users { -- To view, visit https://gerrit.wikimedia.org/r/67837 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I36a3e7df237021723e2a04d669c099f319dddf69 Gerrit-PatchSet: 1 Gerrit-Project: operations/puppet Gerrit-Branch: production Gerrit-Owner: Ottomata ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
[MediaWiki-commits] [Gerrit] RT 5161 - erosen access to stat1001 - change (operations/puppet)
Ottomata has submitted this change and it was merged. Change subject: RT 5161 - erosen access to stat1001 .. RT 5161 - erosen access to stat1001 Change-Id: Ife2b45628012c34fcc31df77822e364b5454026e --- M manifests/site.pp 1 file changed, 3 insertions(+), 1 deletion(-) Approvals: Ottomata: Verified; Looks good to me, approved diff --git a/manifests/site.pp b/manifests/site.pp index 25a5700..4d2b87f 100644 --- a/manifests/site.pp +++ b/manifests/site.pp @@ -2629,7 +2629,9 @@ accounts::milimetric, accounts::rfaulk, #rt4258 # RT 4687 - accounts::ypanda + accounts::ypanda, + # RT 5161 + accounts::erosen sudo_user { "otto": privileges => ['ALL = NOPASSWD: ALL'] } } -- To view, visit https://gerrit.wikimedia.org/r/68189 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: merged Gerrit-Change-Id: Ife2b45628012c34fcc31df77822e364b5454026e Gerrit-PatchSet: 1 Gerrit-Project: operations/puppet Gerrit-Branch: production Gerrit-Owner: Ottomata Gerrit-Reviewer: Ottomata ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
[MediaWiki-commits] [Gerrit] RT 5161 - erosen access to stat1001 - change (operations/puppet)
Ottomata has uploaded a new change for review. https://gerrit.wikimedia.org/r/68189 Change subject: RT 5161 - erosen access to stat1001 .. RT 5161 - erosen access to stat1001 Change-Id: Ife2b45628012c34fcc31df77822e364b5454026e --- M manifests/site.pp 1 file changed, 3 insertions(+), 1 deletion(-) git pull ssh://gerrit.wikimedia.org:29418/operations/puppet refs/changes/89/68189/1 diff --git a/manifests/site.pp b/manifests/site.pp index 25a5700..4d2b87f 100644 --- a/manifests/site.pp +++ b/manifests/site.pp @@ -2629,7 +2629,9 @@ accounts::milimetric, accounts::rfaulk, #rt4258 # RT 4687 - accounts::ypanda + accounts::ypanda, + # RT 5161 + accounts::erosen sudo_user { "otto": privileges => ['ALL = NOPASSWD: ALL'] } } -- To view, visit https://gerrit.wikimedia.org/r/68189 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: newchange Gerrit-Change-Id: Ife2b45628012c34fcc31df77822e364b5454026e Gerrit-PatchSet: 1 Gerrit-Project: operations/puppet Gerrit-Branch: production Gerrit-Owner: Ottomata ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
[MediaWiki-commits] [Gerrit] Adding proper .md formatting for code block to README.md - change (operations...zookeeper)
Ottomata has uploaded a new change for review. https://gerrit.wikimedia.org/r/68210 Change subject: Adding proper .md formatting for code block to README.md .. Adding proper .md formatting for code block to README.md Change-Id: Ia31df558c74227c96672a67e94aa0d6a6c914d74 --- M README.md 1 file changed, 4 insertions(+), 0 deletions(-) git pull ssh://gerrit.wikimedia.org:29418/operations/puppet/zookeeper refs/changes/10/68210/1 diff --git a/README.md b/README.md index 8da7b61..52b5d58 100644 --- a/README.md +++ b/README.md @@ -7,10 +7,12 @@ # Usage +```puppet class { 'zookeeper': hosts=> { 'zoo1.domain.org' => 1, 'zoo2.domain.org' => 2, 'zoo3.domain.org' => 3 }, data_dir => '/var/lib/zookeeper', } +``` The above setup should be used to configure a 3 node zookeeper cluster. You can include the above class on any of your nodes that will need to talk @@ -18,7 +20,9 @@ On the 3 zookeeper server nodes, you should also include: +```puppet class { 'zookeeper::server': } +``` This will ensure that the zookeeper server is running. Remember that this requires that you also include the -- To view, visit https://gerrit.wikimedia.org/r/68210 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: newchange Gerrit-Change-Id: Ia31df558c74227c96672a67e94aa0d6a6c914d74 Gerrit-PatchSet: 1 Gerrit-Project: operations/puppet/zookeeper Gerrit-Branch: master Gerrit-Owner: Ottomata ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
[MediaWiki-commits] [Gerrit] Adding proper .md formatting for code block to README.md - change (operations...zookeeper)
Ottomata has submitted this change and it was merged. Change subject: Adding proper .md formatting for code block to README.md .. Adding proper .md formatting for code block to README.md Change-Id: Ia31df558c74227c96672a67e94aa0d6a6c914d74 --- M README.md 1 file changed, 4 insertions(+), 0 deletions(-) Approvals: Ottomata: Verified; Looks good to me, approved diff --git a/README.md b/README.md index 8da7b61..52b5d58 100644 --- a/README.md +++ b/README.md @@ -7,10 +7,12 @@ # Usage +```puppet class { 'zookeeper': hosts=> { 'zoo1.domain.org' => 1, 'zoo2.domain.org' => 2, 'zoo3.domain.org' => 3 }, data_dir => '/var/lib/zookeeper', } +``` The above setup should be used to configure a 3 node zookeeper cluster. You can include the above class on any of your nodes that will need to talk @@ -18,7 +20,9 @@ On the 3 zookeeper server nodes, you should also include: +```puppet class { 'zookeeper::server': } +``` This will ensure that the zookeeper server is running. Remember that this requires that you also include the -- To view, visit https://gerrit.wikimedia.org/r/68210 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: merged Gerrit-Change-Id: Ia31df558c74227c96672a67e94aa0d6a6c914d74 Gerrit-PatchSet: 1 Gerrit-Project: operations/puppet/zookeeper Gerrit-Branch: master Gerrit-Owner: Ottomata Gerrit-Reviewer: Ottomata ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
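The README snippets fenced above show the client class and the server class separately. For one of the three server hosts, which is both a ZooKeeper client and a server, the two declarations would simply be combined; a purely illustrative sketch using the same hostnames as the README example:

```puppet
# Illustrative only: one of the three server nodes from the README
# example, acting as both ZooKeeper client and server.
node 'zoo1.domain.org' {
    class { 'zookeeper':
        hosts    => {
            'zoo1.domain.org' => 1,
            'zoo2.domain.org' => 2,
            'zoo3.domain.org' => 3,
        },
        data_dir => '/var/lib/zookeeper',
    }

    # Only the three server nodes also declare this.
    class { 'zookeeper::server': }
}
```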
[MediaWiki-commits] [Gerrit] No need to have a special 'BROKER_JMX_PORT' variable if kafk... - change (operations...kafka)
Ottomata has submitted this change and it was merged. Change subject: No need to have a special 'BROKER_JMX_PORT' variable if kafka.default is only read by kafka.init .. No need to have a special 'BROKER_JMX_PORT' variable if kafka.default is only read by kafka.init Change-Id: I76c233da8dd376ba48d0322fe78b6436c6060b60 --- M debian/kafka.default M debian/kafka.init 2 files changed, 3 insertions(+), 3 deletions(-) Approvals: Ottomata: Verified; Looks good to me, approved diff --git a/debian/kafka.default b/debian/kafka.default index 4977469..f2d406a 100755 --- a/debian/kafka.default +++ b/debian/kafka.default @@ -2,8 +2,8 @@ KAFKA_START=no # The default JMX_PORT for Kafka Brokers is . -# Set BROKER_JMX_PORT to something else to override this. -# BROKER_JMX_PORT= +# Set JMX_PORT to something else to override this. +# JMX_PORT= # JMX options KAFKA_JMX_OPTS=${KAFKA_JMX_OPTS:="-Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false"} diff --git a/debian/kafka.init b/debian/kafka.init index 4bcc953..ee53834 100755 --- a/debian/kafka.init +++ b/debian/kafka.init @@ -80,7 +80,7 @@ kafka_sh() { # Escape any double quotes in the value of JAVA_OPTS JAVA_OPTS="$(echo $JAVA_OPTS | sed 's/\"/\\\"/g')" - JMX_PORT=${BROKER_JMX_PORT:-} + JMX_PORT=${JMX_PORT:-} KAFKA_JMX_OPTS="$KAFKA_JMX_OPTS -Dcom.sun.management.jmxremote.port=$JMX_PORT" # Define the command to run kafka as a daemon -- To view, visit https://gerrit.wikimedia.org/r/68443 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: merged Gerrit-Change-Id: I76c233da8dd376ba48d0322fe78b6436c6060b60 Gerrit-PatchSet: 1 Gerrit-Project: operations/debs/kafka Gerrit-Branch: debian Gerrit-Owner: Ottomata Gerrit-Reviewer: Ottomata ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
[MediaWiki-commits] [Gerrit] No need to have a special 'BROKER_JMX_PORT' variable if kafk... - change (operations...kafka)
Ottomata has uploaded a new change for review. https://gerrit.wikimedia.org/r/68443 Change subject: No need to have a special 'BROKER_JMX_PORT' variable if kafka.default is only read by kafka.init .. No need to have a special 'BROKER_JMX_PORT' variable if kafka.default is only read by kafka.init Change-Id: I76c233da8dd376ba48d0322fe78b6436c6060b60 --- M debian/kafka.default M debian/kafka.init 2 files changed, 3 insertions(+), 3 deletions(-) git pull ssh://gerrit.wikimedia.org:29418/operations/debs/kafka refs/changes/43/68443/1 diff --git a/debian/kafka.default b/debian/kafka.default index 4977469..f2d406a 100755 --- a/debian/kafka.default +++ b/debian/kafka.default @@ -2,8 +2,8 @@ KAFKA_START=no # The default JMX_PORT for Kafka Brokers is . -# Set BROKER_JMX_PORT to something else to override this. -# BROKER_JMX_PORT= +# Set JMX_PORT to something else to override this. +# JMX_PORT= # JMX options KAFKA_JMX_OPTS=${KAFKA_JMX_OPTS:="-Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false"} diff --git a/debian/kafka.init b/debian/kafka.init index 4bcc953..ee53834 100755 --- a/debian/kafka.init +++ b/debian/kafka.init @@ -80,7 +80,7 @@ kafka_sh() { # Escape any double quotes in the value of JAVA_OPTS JAVA_OPTS="$(echo $JAVA_OPTS | sed 's/\"/\\\"/g')" - JMX_PORT=${BROKER_JMX_PORT:-} + JMX_PORT=${JMX_PORT:-} KAFKA_JMX_OPTS="$KAFKA_JMX_OPTS -Dcom.sun.management.jmxremote.port=$JMX_PORT" # Define the command to run kafka as a daemon -- To view, visit https://gerrit.wikimedia.org/r/68443 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I76c233da8dd376ba48d0322fe78b6436c6060b60 Gerrit-PatchSet: 1 Gerrit-Project: operations/debs/kafka Gerrit-Branch: debian Gerrit-Owner: Ottomata ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
[MediaWiki-commits] [Gerrit] Making cdh4::hadoop::directory { '/user/hdfs': require Cdh4:... - change (operations...cdh4)
Ottomata has uploaded a new change for review. https://gerrit.wikimedia.org/r/69146 Change subject: Making cdh4::hadoop::directory { '/user/hdfs': require Cdh4::Hadoop::Directory['/user'], .. Making cdh4::hadoop::directory { '/user/hdfs': require Cdh4::Hadoop::Directory['/user'], Change-Id: I15880d544618cbfe5d39744ce3cd599c100d9b83 --- M manifests/hadoop/namenode.pp 1 file changed, 14 insertions(+), 12 deletions(-) git pull ssh://gerrit.wikimedia.org:29418/operations/puppet/cdh4 refs/changes/46/69146/1 diff --git a/manifests/hadoop/namenode.pp b/manifests/hadoop/namenode.pp index 0f77138..e12e283 100644 --- a/manifests/hadoop/namenode.pp +++ b/manifests/hadoop/namenode.pp @@ -56,32 +56,33 @@ # sudo -u hdfs hadoop fs -mkdir /tmp # sudo -u hdfs hadoop fs -chmod 1777 /tmp cdh4::hadoop::directory { '/tmp': -owner => 'hdfs', -group => 'hdfs', -mode => '1777', +owner => 'hdfs', +group => 'hdfs', +mode=> '1777', } # sudo -u hdfs hadoop fs -mkdir /user # sudo -u hdfs hadoop fs -chmod 0775 /user # sudo -u hdfs hadoop fs -chown hdfs:hadoop /user cdh4::hadoop::directory { '/user': -owner => 'hdfs', -group => 'hadoop', -mode => '0775', +owner => 'hdfs', +group => 'hadoop', +mode=> '0775', } # sudo -u hdfs hadoop fs -mkdir /user/hdfs cdh4::hadoop::directory { '/user/hdfs': -owner => 'hdfs', -group => 'hdfs', -mode => '0755', +owner => 'hdfs', +group => 'hdfs', +mode=> '0755', +require => Cdh4::Hadoop::Directory['/user'], } # sudo -u hdfs hadoop fs -mkdir /var cdh4::hadoop::directory { '/var': -owner => 'hdfs', -group => 'hdfs', -mode => '0755', +owner => 'hdfs', +group => 'hdfs', +mode=> '0755', } # sudo -u hdfs hadoop fs -mkdir /var/lib @@ -98,6 +99,7 @@ group => 'hdfs', mode=> '0755', require => Cdh4::Hadoop::Directory['/var'], + } } -- To view, visit https://gerrit.wikimedia.org/r/69146 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I15880d544618cbfe5d39744ce3cd599c100d9b83 Gerrit-PatchSet: 1 Gerrit-Project: operations/puppet/cdh4 Gerrit-Branch: master Gerrit-Owner: Ottomata ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
[MediaWiki-commits] [Gerrit] Making cdh4::hadoop::directory { '/user/hdfs': require Cdh4:... - change (operations...cdh4)
Ottomata has submitted this change and it was merged. Change subject: Making cdh4::hadoop::directory { '/user/hdfs': require Cdh4::Hadoop::Directory['/user'], .. Making cdh4::hadoop::directory { '/user/hdfs': require Cdh4::Hadoop::Directory['/user'], Change-Id: I15880d544618cbfe5d39744ce3cd599c100d9b83 --- M manifests/hadoop/namenode.pp 1 file changed, 13 insertions(+), 12 deletions(-) Approvals: Ottomata: Verified; Looks good to me, approved diff --git a/manifests/hadoop/namenode.pp b/manifests/hadoop/namenode.pp index 0f77138..5afda70 100644 --- a/manifests/hadoop/namenode.pp +++ b/manifests/hadoop/namenode.pp @@ -56,32 +56,33 @@ # sudo -u hdfs hadoop fs -mkdir /tmp # sudo -u hdfs hadoop fs -chmod 1777 /tmp cdh4::hadoop::directory { '/tmp': -owner => 'hdfs', -group => 'hdfs', -mode => '1777', +owner => 'hdfs', +group => 'hdfs', +mode=> '1777', } # sudo -u hdfs hadoop fs -mkdir /user # sudo -u hdfs hadoop fs -chmod 0775 /user # sudo -u hdfs hadoop fs -chown hdfs:hadoop /user cdh4::hadoop::directory { '/user': -owner => 'hdfs', -group => 'hadoop', -mode => '0775', +owner => 'hdfs', +group => 'hadoop', +mode=> '0775', } # sudo -u hdfs hadoop fs -mkdir /user/hdfs cdh4::hadoop::directory { '/user/hdfs': -owner => 'hdfs', -group => 'hdfs', -mode => '0755', +owner => 'hdfs', +group => 'hdfs', +mode=> '0755', +require => Cdh4::Hadoop::Directory['/user'], } # sudo -u hdfs hadoop fs -mkdir /var cdh4::hadoop::directory { '/var': -owner => 'hdfs', -group => 'hdfs', -mode => '0755', +owner => 'hdfs', +group => 'hdfs', +mode=> '0755', } # sudo -u hdfs hadoop fs -mkdir /var/lib -- To view, visit https://gerrit.wikimedia.org/r/69146 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: merged Gerrit-Change-Id: I15880d544618cbfe5d39744ce3cd599c100d9b83 Gerrit-PatchSet: 2 Gerrit-Project: operations/puppet/cdh4 Gerrit-Branch: master Gerrit-Owner: Ottomata Gerrit-Reviewer: Ottomata Gerrit-Reviewer: jenkins-bot ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
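The point of the change is ordering: /user/hdfs must not be created in HDFS before /user exists, hence the explicit require metaparameter. The same dependency could also be written with a chaining arrow; a minimal sketch of both forms, reusing the resources from namenode.pp (not a drop-in patch):

```puppet
# As merged: /user/hdfs is only created once /user exists.
cdh4::hadoop::directory { '/user/hdfs':
    owner   => 'hdfs',
    group   => 'hdfs',
    mode    => '0755',
    require => Cdh4::Hadoop::Directory['/user'],
}

# Equivalent ordering with a chaining arrow instead of require
# (both resources still have to be declared elsewhere):
Cdh4::Hadoop::Directory['/user'] -> Cdh4::Hadoop::Directory['/user/hdfs']
```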
[MediaWiki-commits] [Gerrit] Fixing packet_loss_log file on oxygen udp2log instance - change (operations/puppet)
Ottomata has uploaded a new change for review. https://gerrit.wikimedia.org/r/69272 Change subject: Fixing packet_loss_log file on oxygen udp2log instance .. Fixing packet_loss_log file on oxygen udp2log instance Change-Id: I0e485795ca8c7e644f79e3c30660554c11c86b8a --- M manifests/role/logging.pp 1 file changed, 1 insertion(+), 0 deletions(-) git pull ssh://gerrit.wikimedia.org:29418/operations/puppet refs/changes/72/69272/1 diff --git a/manifests/role/logging.pp b/manifests/role/logging.pp index 5192dfb..bb9304a 100644 --- a/manifests/role/logging.pp +++ b/manifests/role/logging.pp @@ -274,6 +274,7 @@ misc::udp2log::instance { 'oxygen': multicast => true, + packet_loss_log => '/var/log/udp2log/packet-loss.log' log_directory => $webrequest_log_directory, } } -- To view, visit https://gerrit.wikimedia.org/r/69272 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I0e485795ca8c7e644f79e3c30660554c11c86b8a Gerrit-PatchSet: 1 Gerrit-Project: operations/puppet Gerrit-Branch: production Gerrit-Owner: Ottomata ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
[MediaWiki-commits] [Gerrit] Fixing packet_loss_log file on oxygen udp2log instance - change (operations/puppet)
Ottomata has submitted this change and it was merged. Change subject: Fixing packet_loss_log file on oxygen udp2log instance .. Fixing packet_loss_log file on oxygen udp2log instance Change-Id: I0e485795ca8c7e644f79e3c30660554c11c86b8a --- M manifests/role/logging.pp 1 file changed, 3 insertions(+), 2 deletions(-) Approvals: Ottomata: Verified; Looks good to me, approved diff --git a/manifests/role/logging.pp b/manifests/role/logging.pp index 5192dfb..d09aa12 100644 --- a/manifests/role/logging.pp +++ b/manifests/role/logging.pp @@ -273,8 +273,9 @@ $webrequest_log_directory= "$log_directory/webrequest" misc::udp2log::instance { 'oxygen': - multicast => true, - log_directory => $webrequest_log_directory, + multicast => true, + packet_loss_log => '/var/log/udp2log/packet-loss.log', + log_directory => $webrequest_log_directory, } } -- To view, visit https://gerrit.wikimedia.org/r/69272 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: merged Gerrit-Change-Id: I0e485795ca8c7e644f79e3c30660554c11c86b8a Gerrit-PatchSet: 2 Gerrit-Project: operations/puppet Gerrit-Branch: production Gerrit-Owner: Ottomata Gerrit-Reviewer: Ottomata Gerrit-Reviewer: jenkins-bot ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
[MediaWiki-commits] [Gerrit] Lowering alert thresholds on kakfa-broker-ProduceRequestsPer... - change (operations/puppet)
Ottomata has uploaded a new change for review. https://gerrit.wikimedia.org/r/69307 Change subject: Lowering alert thresholds on kakfa-broker-ProduceRequestsPerSecond .. Lowering alert thresholds on kakfa-broker-ProduceRequestsPerSecond Change-Id: Id92025011d8a440339e792fa0b795be67a71fae0 --- M manifests/misc/analytics.pp 1 file changed, 1 insertion(+), 1 deletion(-) git pull ssh://gerrit.wikimedia.org:29418/operations/puppet refs/changes/07/69307/1 diff --git a/manifests/misc/analytics.pp b/manifests/misc/analytics.pp index bea13d8..cf598fa 100644 --- a/manifests/misc/analytics.pp +++ b/manifests/misc/analytics.pp @@ -32,7 +32,7 @@ # for this udp2log instance. monitor_service { "kakfa-broker-ProduceRequestsPerSecond": description => "kafka_network_SocketServerStats.ProduceRequestsPerSecond", - check_command => "check_kafka_broker_produce_requests!3!2", + check_command => "check_kafka_broker_produce_requests!2!1", contact_group => "analytics", } } -- To view, visit https://gerrit.wikimedia.org/r/69307 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: newchange Gerrit-Change-Id: Id92025011d8a440339e792fa0b795be67a71fae0 Gerrit-PatchSet: 1 Gerrit-Project: operations/puppet Gerrit-Branch: production Gerrit-Owner: Ottomata ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
[MediaWiki-commits] [Gerrit] Lowering alert thresholds on kakfa-broker-ProduceRequestsPer... - change (operations/puppet)
Ottomata has submitted this change and it was merged. Change subject: Lowering alert thresholds on kakfa-broker-ProduceRequestsPerSecond .. Lowering alert thresholds on kakfa-broker-ProduceRequestsPerSecond Change-Id: Id92025011d8a440339e792fa0b795be67a71fae0 --- M manifests/misc/analytics.pp 1 file changed, 1 insertion(+), 1 deletion(-) Approvals: Ottomata: Verified; Looks good to me, approved diff --git a/manifests/misc/analytics.pp b/manifests/misc/analytics.pp index bea13d8..cf598fa 100644 --- a/manifests/misc/analytics.pp +++ b/manifests/misc/analytics.pp @@ -32,7 +32,7 @@ # for this udp2log instance. monitor_service { "kakfa-broker-ProduceRequestsPerSecond": description => "kafka_network_SocketServerStats.ProduceRequestsPerSecond", - check_command => "check_kafka_broker_produce_requests!3!2", + check_command => "check_kafka_broker_produce_requests!2!1", contact_group => "analytics", } } -- To view, visit https://gerrit.wikimedia.org/r/69307 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: merged Gerrit-Change-Id: Id92025011d8a440339e792fa0b795be67a71fae0 Gerrit-PatchSet: 1 Gerrit-Project: operations/puppet Gerrit-Branch: production Gerrit-Owner: Ottomata Gerrit-Reviewer: Ottomata ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
[MediaWiki-commits] [Gerrit] Adding spetrea to admins::mortals so he has an account on ba... - change (operations/puppet)
Ottomata has uploaded a new change for review. https://gerrit.wikimedia.org/r/69313 Change subject: Adding spetrea to admins::mortals so he has an account on bastion hosts. .. Adding spetrea to admins::mortals so he has an account on bastion hosts. Stefan needs and already has access to stat1002.eqiad.wmnet, but does not have an account on bast1001.wikimedia.org. Change-Id: Id8ac1335c1b149010f0bbcadee4d6d82469e2e38 --- M manifests/admins.pp 1 file changed, 1 insertion(+), 0 deletions(-) git pull ssh://gerrit.wikimedia.org:29418/operations/puppet refs/changes/13/69313/1 diff --git a/manifests/admins.pp b/manifests/admins.pp index 985f155..d849ebf 100644 --- a/manifests/admins.pp +++ b/manifests/admins.pp @@ -2977,6 +2977,7 @@ include accounts::rmoen include accounts::robla include accounts::spage + include accounts::spetrea include accounts::sumanah #RT 3752 include accounts::yurik #rt 4835, rt 5069 include accounts::zak # access revoked -- To view, visit https://gerrit.wikimedia.org/r/69313 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: newchange Gerrit-Change-Id: Id8ac1335c1b149010f0bbcadee4d6d82469e2e38 Gerrit-PatchSet: 1 Gerrit-Project: operations/puppet Gerrit-Branch: production Gerrit-Owner: Ottomata ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
[MediaWiki-commits] [Gerrit] Puppetizing hive client, server and metastore. - change (operations...cdh4)
Ottomata has uploaded a new change for review. https://gerrit.wikimedia.org/r/69353 Change subject: Puppetizing hive client, server and metastore. .. Puppetizing hive client, server and metastore. This change is not yet ready for review! Change-Id: I82b36cbdf4322d2b90caabbe0e7308a35dddbb5d --- M README.md M manifests/hive.pp A manifests/hive/defaults.pp A manifests/hive/master.pp A manifests/hive/metastore.pp A manifests/hive/metastore/mysql.pp A manifests/hive/server.pp M manifests/sqoop.pp A templates/hive/hive-site.xml.erb M tests/Makefile M tests/hive.pp A tests/hive_master.pp A tests/hive_metastore.pp A tests/hive_metastore_mysql.pp A tests/hive_server.pp 15 files changed, 575 insertions(+), 8 deletions(-) git pull ssh://gerrit.wikimedia.org:29418/operations/puppet/cdh4 refs/changes/53/69353/1 diff --git a/README.md b/README.md index 549c141..6730a1b 100644 --- a/README.md +++ b/README.md @@ -4,15 +4,19 @@ Cloudera's Distribution 4 (CDH4) for Apache Hadoop. # Description + Installs HDFS, YARN or MR1, Hive, HBase, Pig, Sqoop, Zookeeper, Oozie and Hue. Note that, in order for this module to work, you will have to ensure that: -* Sun JRE version 6 or greater is installed -* Your package manager is configured with a repository containing the +- Sun JRE version 6 or greater is installed +- Your package manager is configured with a repository containing the Cloudera 4 packages. +This module has been tested using CDH 4.2.1 on Ubuntu Precise 12.04.2 LTS + # Installation: + Clone (or copy) this repository into your puppet modules/cdh4 directory: ```bash git clone git://github.com/wikimedia/cloudera-cdh4-puppet.git modules/cdh4 @@ -28,6 +32,7 @@ # Usage ## For all Hadoop nodes: + ```puppet include cdh4 @@ -51,14 +56,44 @@ If you would like to use MRv1 instead of YARN, set ```use_yarn``` to false. ## For your Hadoop master node: + ```puppet include cdh4::hadoop::master ``` This installs and starts up the NameNode. If using YARN this will install and set up ResourceManager and HistoryServer. If using MRv1, this will install and set up the JobTracker. ### For your Hadoop worker nodes: + ```puppet include cdh4::hadoop::worker ``` This installs and starts up the DataNode. If using YARN, this will install and set up the NodeManager. If using MRv1, this will install and set up the TaskTracker. + +## For all Hive enabled nodes: + +```puppet +class { 'cdh4::hive': + zookeeper_hosts => ['zk1.domain.org', 'zk2.domain.org'], + jdbc_password => $secret_password, +} +``` + +## For your Hive master node (hive-server2 and hive-metastore): + +Include the same ```cdh4::hive``` class as indicated above, and then: + +```puppet +class { 'cdh4::hive::master': } +``` + +By default, a Hive metastore backend of MySQL will be used. You must separately +ensure that your $metastore_database (e.g. mysql) package is installed. +If you want to disable automatic configuration of your metastore backend +database, set the ```metastore_database``` parameter to undef: + +```puppet +class { 'cdh4::hive::master': + metastore_database => undef, +} +``` diff --git a/manifests/hive.pp b/manifests/hive.pp index 71ca4bf..ded40dd 100644 --- a/manifests/hive.pp +++ b/manifests/hive.pp @@ -3,8 +3,74 @@ # Installs Hive packages (needed for Hive Client). # Use cdh4::hive::server to install and set up a Hive server. # -class cdh4::hive { +# == Parameters +# $zookeeper_hosts - Array of zookeeper hostname/IP(:port)s. +#Default: undef (zookeeper lock management +#will not be used). +# +# $jdbc_database - Metastore JDBC database name. 
+#Default: 'hive_metastore' +# $jdbc_username - Metastore JDBC username. Default: hive +# $jdbc_password - Metastore JDBC password. Default: hive +# $jdbc_host - Metastore JDBC hostname. Default: localhost +# $jdbc_driver - Metastore JDBC driver class name. +#Default: org.apache.derby.jdbc.EmbeddedDriver +# $jdbc_protocol - Metastore JDBC protocol. Default: mysql +# +# $exec_parallel_thread_number - Number of jobs at most can be executed in parallel. +#Set this to 0 to disable parallel execution. +# $optimize_skewjoin - Enable or disable skew join optimization. +#Default: false +# $skewjoin_key- Number of rows where skew join is used. +# - Default: 1 +# $skewjoin_mapjoin_map_tasks - Number of map tasks used in the follow up +#map join jobfor a skew join. Default: 1. +# $skewjoin_mapjoin_min_split - Skew join minimum split size. Default: 33554432 +# +# $stats_enabled - Enable or disable
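The change above is explicitly not yet ready for review, but the parameter documentation already sketches the intended interface. Assuming the parameters land as documented, a declaration pointing the metastore at a MySQL backend might look like the following; the hostnames and the password variable are placeholders, only the parameter names come from the class docs above:

```puppet
# Hypothetical declaration; values are placeholders, not part of the change.
class { 'cdh4::hive':
    zookeeper_hosts => ['zk1.domain.org', 'zk2.domain.org'],
    jdbc_protocol   => 'mysql',
    jdbc_host       => 'hive-metastore-db.domain.org',
    jdbc_database   => 'hive_metastore',
    jdbc_username   => 'hive',
    jdbc_password   => $secret_password,
}
```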
[MediaWiki-commits] [Gerrit] Including role::analytics::hadoop::worker on analytics1019 a... - change (operations/puppet)
Ottomata has submitted this change and it was merged. Change subject: Including role::analytics::hadoop::worker on analytics1019 and analytics1020 .. Including role::analytics::hadoop::worker on analytics1019 and analytics1020 Change-Id: I2a07806cf1dec67fa41e92c4223cb8b500367688 --- M manifests/role/analytics.pp M manifests/role/analytics/hadoop.pp M manifests/site.pp 3 files changed, 16 insertions(+), 32 deletions(-) Approvals: Ottomata: Verified; Looks good to me, approved jenkins-bot: Verified diff --git a/manifests/role/analytics.pp b/manifests/role/analytics.pp index f211f50..ce094c0 100644 --- a/manifests/role/analytics.pp +++ b/manifests/role/analytics.pp @@ -31,16 +31,16 @@ distribution => 'oracle', version => 6, } -} -# Contains list of reinstalled analytics nodes. -# Once all analytics nodes are reinstalled, this -# class will be removed. -class role::analytics::reinstalled { -$nodes = [ -'analytics1001', -'analytics1020', -] + +# Include these common classes on all analytics nodes. +# (for now we only include these on reinstalled and +# fully puppetized nodes.) +if ($hostname =~ /analytics10(19|20)/) { +include role::analytics::pig +include role::analytics::hive +include role::analytics::sqoop +} } class role::analytics::users { @@ -77,6 +77,7 @@ sudo_user { [ "diederik", "dsc", "otto" ]: privileges => ['ALL = (ALL) NOPASSWD: ALL'] } } + # front end interfaces for Kraken and Hadoop class role::analytics::frontend inherits role::analytics { # include a mysql database for Sqoop and Oozie diff --git a/manifests/role/analytics/hadoop.pp b/manifests/role/analytics/hadoop.pp index fd3757a..c05179a 100644 --- a/manifests/role/analytics/hadoop.pp +++ b/manifests/role/analytics/hadoop.pp @@ -99,10 +99,6 @@ content => template('hadoop/fair-scheduler-allocation.xml.erb'), require => Class['cdh4::hadoop'], } - -include cdh4::hive -include cdh4::pig -include cdh4::sqoop } @@ -144,10 +140,6 @@ content => template('hadoop/fair-scheduler-allocation.xml.erb'), require => Class['cdh4::hadoop'], } - -include cdh4::hive -include cdh4::pig -include cdh4::sqoop } diff --git a/manifests/site.pp b/manifests/site.pp index ebee377..06c2ea1 100644 --- a/manifests/site.pp +++ b/manifests/site.pp @@ -136,14 +136,11 @@ include misc::udp2log::iptables } -# analytics1020 is a Hadoop Worker -node "analytics1020.eqiad.wmnet" { +# analytics1011-analytics1020 are Hadoop Worker nodes +node /analytics10(19|20).eqiad.wmnet/ { include role::analytics + include role::analytics::hadoop::worker } - - - - ### Analytics nodes below this line need to be reinstalled. @@ -190,8 +187,8 @@ include role::analytics::kafka::server } -# analytics1007, analytics1009-analytics1020, analytics1023-analytics1027 -node /analytics10(0[7]|1[0-9]|2[234567])\.eqiad\.wmnet/ { +# analytics1007, analytics1009-analytics1018, analytics1023-analytics1027 +node /analytics10(0[7]|1[0-8]|2[234567])\.eqiad\.wmnet/ { # ganglia aggregator for the Analytics cluster. if ($hostname == "analytics1011") { $ganglia_aggregator = true @@ -200,13 +197,7 @@ include role::analytics } -# # analytics1011-analytics1020 are Kraken Hadoop Datanodes. -# # TODO: Puppetize all Hadoop Datanodes. analytics1020 -# # is being used as the first puppetization test. -# node "analytics1020.eqiad.wmnet" { -# include role::analytics -# include role::hadoop::worker -# } + # analytics1027 hosts the frontend # interfaces to Kraken and Hadoop. 
-- To view, visit https://gerrit.wikimedia.org/r/69518 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: merged Gerrit-Change-Id: I2a07806cf1dec67fa41e92c4223cb8b500367688 Gerrit-PatchSet: 2 Gerrit-Project: operations/puppet Gerrit-Branch: production Gerrit-Owner: Ottomata Gerrit-Reviewer: Ottomata Gerrit-Reviewer: jenkins-bot ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
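The merged node pattern /analytics10(19|20).eqiad.wmnet/ leaves the dots unescaped, so they match any character; in practice it selects the same two hosts, but the other analytics node regex in the same file escapes its dots. An illustrative variant with escaped dots, assuming the same role includes:

```puppet
# Illustrative variant only: same worker nodes, with the hostname dots
# escaped as the other node regexes in site.pp do.
node /analytics10(19|20)\.eqiad\.wmnet/ {
    include role::analytics
    include role::analytics::hadoop::worker
}
```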
[MediaWiki-commits] [Gerrit] Adding role/analytics/hive, pig, sqoop.pp - change (operations/puppet)
Ottomata has submitted this change and it was merged. Change subject: Adding role/analytics/hive,pig,sqoop.pp .. Adding role/analytics/hive,pig,sqoop.pp Change-Id: Ifc03bdb08b99b55a3a2f955b2d9e2245402ee831 --- A manifests/role/analytics/hive.pp A manifests/role/analytics/pig.pp A manifests/role/analytics/sqoop.pp 3 files changed, 11 insertions(+), 0 deletions(-) Approvals: Ottomata: Verified; Looks good to me, approved diff --git a/manifests/role/analytics/hive.pp b/manifests/role/analytics/hive.pp new file mode 100644 index 000..f96ed1a --- /dev/null +++ b/manifests/role/analytics/hive.pp @@ -0,0 +1,5 @@ +# == Class role::analytics::hive +# +class role::analytics::hive { +include cdh4::hive +} \ No newline at end of file diff --git a/manifests/role/analytics/pig.pp b/manifests/role/analytics/pig.pp new file mode 100644 index 000..765f9d9 --- /dev/null +++ b/manifests/role/analytics/pig.pp @@ -0,0 +1,3 @@ +class role::analytics::pig { +include cdh4::pig +} \ No newline at end of file diff --git a/manifests/role/analytics/sqoop.pp b/manifests/role/analytics/sqoop.pp new file mode 100644 index 000..0908633 --- /dev/null +++ b/manifests/role/analytics/sqoop.pp @@ -0,0 +1,3 @@ +class role::analytics::sqoop { +include cdh4::sqoop +} \ No newline at end of file -- To view, visit https://gerrit.wikimedia.org/r/69521 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: merged Gerrit-Change-Id: Ifc03bdb08b99b55a3a2f955b2d9e2245402ee831 Gerrit-PatchSet: 1 Gerrit-Project: operations/puppet Gerrit-Branch: production Gerrit-Owner: Ottomata Gerrit-Reviewer: Ottomata ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
[MediaWiki-commits] [Gerrit] Adding role/analytics/hive, pig, sqoop.pp - change (operations/puppet)
Ottomata has uploaded a new change for review. https://gerrit.wikimedia.org/r/69521 Change subject: Adding role/analytics/hive,pig,sqoop.pp .. Adding role/analytics/hive,pig,sqoop.pp Change-Id: Ifc03bdb08b99b55a3a2f955b2d9e2245402ee831 --- A manifests/role/analytics/hive.pp A manifests/role/analytics/pig.pp A manifests/role/analytics/sqoop.pp 3 files changed, 11 insertions(+), 0 deletions(-) git pull ssh://gerrit.wikimedia.org:29418/operations/puppet refs/changes/21/69521/1 diff --git a/manifests/role/analytics/hive.pp b/manifests/role/analytics/hive.pp new file mode 100644 index 000..f96ed1a --- /dev/null +++ b/manifests/role/analytics/hive.pp @@ -0,0 +1,5 @@ +# == Class role::analytics::hive +# +class role::analytics::hive { +include cdh4::hive +} \ No newline at end of file diff --git a/manifests/role/analytics/pig.pp b/manifests/role/analytics/pig.pp new file mode 100644 index 000..765f9d9 --- /dev/null +++ b/manifests/role/analytics/pig.pp @@ -0,0 +1,3 @@ +class role::analytics::pig { +include cdh4::pig +} \ No newline at end of file diff --git a/manifests/role/analytics/sqoop.pp b/manifests/role/analytics/sqoop.pp new file mode 100644 index 000..0908633 --- /dev/null +++ b/manifests/role/analytics/sqoop.pp @@ -0,0 +1,3 @@ +class role::analytics::sqoop { +include cdh4::sqoop +} \ No newline at end of file -- To view, visit https://gerrit.wikimedia.org/r/69521 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: newchange Gerrit-Change-Id: Ifc03bdb08b99b55a3a2f955b2d9e2245402ee831 Gerrit-PatchSet: 1 Gerrit-Project: operations/puppet Gerrit-Branch: production Gerrit-Owner: Ottomata ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
[MediaWiki-commits] [Gerrit] Including role::analytics::hadoop::worker on analytics1019 a... - change (operations/puppet)
Ottomata has uploaded a new change for review. https://gerrit.wikimedia.org/r/69518 Change subject: Including role::analytics::hadoop::worker on analytics1019 and analytics1020 .. Including role::analytics::hadoop::worker on analytics1019 and analytics1020 Change-Id: I2a07806cf1dec67fa41e92c4223cb8b500367688 --- M manifests/role/analytics.pp M manifests/role/analytics/hadoop.pp M manifests/site.pp 3 files changed, 16 insertions(+), 32 deletions(-) git pull ssh://gerrit.wikimedia.org:29418/operations/puppet refs/changes/18/69518/1 diff --git a/manifests/role/analytics.pp b/manifests/role/analytics.pp index f211f50..ce094c0 100644 --- a/manifests/role/analytics.pp +++ b/manifests/role/analytics.pp @@ -31,16 +31,16 @@ distribution => 'oracle', version => 6, } -} -# Contains list of reinstalled analytics nodes. -# Once all analytics nodes are reinstalled, this -# class will be removed. -class role::analytics::reinstalled { -$nodes = [ -'analytics1001', -'analytics1020', -] + +# Include these common classes on all analytics nodes. +# (for now we only include these on reinstalled and +# fully puppetized nodes.) +if ($hostname =~ /analytics10(19|20)/) { +include role::analytics::pig +include role::analytics::hive +include role::analytics::sqoop +} } class role::analytics::users { @@ -77,6 +77,7 @@ sudo_user { [ "diederik", "dsc", "otto" ]: privileges => ['ALL = (ALL) NOPASSWD: ALL'] } } + # front end interfaces for Kraken and Hadoop class role::analytics::frontend inherits role::analytics { # include a mysql database for Sqoop and Oozie diff --git a/manifests/role/analytics/hadoop.pp b/manifests/role/analytics/hadoop.pp index fd3757a..c05179a 100644 --- a/manifests/role/analytics/hadoop.pp +++ b/manifests/role/analytics/hadoop.pp @@ -99,10 +99,6 @@ content => template('hadoop/fair-scheduler-allocation.xml.erb'), require => Class['cdh4::hadoop'], } - -include cdh4::hive -include cdh4::pig -include cdh4::sqoop } @@ -144,10 +140,6 @@ content => template('hadoop/fair-scheduler-allocation.xml.erb'), require => Class['cdh4::hadoop'], } - -include cdh4::hive -include cdh4::pig -include cdh4::sqoop } diff --git a/manifests/site.pp b/manifests/site.pp index ebee377..26b9319 100644 --- a/manifests/site.pp +++ b/manifests/site.pp @@ -136,14 +136,11 @@ include misc::udp2log::iptables } -# analytics1020 is a Hadoop Worker -node "analytics1020.eqiad.wmnet" { +# analytics1011-analytics1020 are Hadoop Worker nodes +node /analytics10(19|20).eqiad.wmnet { include role::analytics + include role::analytics::hadoop::worker } - - - - ### Analytics nodes below this line need to be reinstalled. @@ -190,8 +187,8 @@ include role::analytics::kafka::server } -# analytics1007, analytics1009-analytics1020, analytics1023-analytics1027 -node /analytics10(0[7]|1[0-9]|2[234567])\.eqiad\.wmnet/ { +# analytics1007, analytics1009-analytics1018, analytics1023-analytics1027 +node /analytics10(0[7]|1[0-8]|2[234567])\.eqiad\.wmnet/ { # ganglia aggregator for the Analytics cluster. if ($hostname == "analytics1011") { $ganglia_aggregator = true @@ -200,13 +197,7 @@ include role::analytics } -# # analytics1011-analytics1020 are Kraken Hadoop Datanodes. -# # TODO: Puppetize all Hadoop Datanodes. analytics1020 -# # is being used as the first puppetization test. -# node "analytics1020.eqiad.wmnet" { -# include role::analytics -# include role::hadoop::worker -# } + # analytics1027 hosts the frontend # interfaces to Kraken and Hadoop. 
-- To view, visit https://gerrit.wikimedia.org/r/69518 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I2a07806cf1dec67fa41e92c4223cb8b500367688 Gerrit-PatchSet: 1 Gerrit-Project: operations/puppet Gerrit-Branch: production Gerrit-Owner: Ottomata ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
[MediaWiki-commits] [Gerrit] Fixing pig.properties.erb comment - change (operations...cdh4)
Ottomata has submitted this change and it was merged. Change subject: Fixing pig.properties.erb comment .. Fixing pig.properties.erb comment Change-Id: I69a5cb4c7e93ff1c944431115284ecd840725416 --- M templates/pig/pig.properties.erb 1 file changed, 1 insertion(+), 1 deletion(-) Approvals: Ottomata: Verified; Looks good to me, approved jenkins-bot: Verified diff --git a/templates/pig/pig.properties.erb b/templates/pig/pig.properties.erb index ea1ff44..004d2e6 100644 --- a/templates/pig/pig.properties.erb +++ b/templates/pig/pig.properties.erb @@ -27,7 +27,7 @@ # hod realted properties #ssh.gateway #hod.expect.root -#hod.expect.useensure => installed +#hod.expect.uselatest #hod.command #hod.config.dir #hod.param -- To view, visit https://gerrit.wikimedia.org/r/69523 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: merged Gerrit-Change-Id: I69a5cb4c7e93ff1c944431115284ecd840725416 Gerrit-PatchSet: 1 Gerrit-Project: operations/puppet/cdh4 Gerrit-Branch: master Gerrit-Owner: Ottomata Gerrit-Reviewer: Ottomata Gerrit-Reviewer: jenkins-bot ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
[MediaWiki-commits] [Gerrit] Fixing pig.properties.erb comment - change (operations...cdh4)
Ottomata has uploaded a new change for review. https://gerrit.wikimedia.org/r/69523 Change subject: Fixing pig.properties.erb comment .. Fixing pig.properties.erb comment Change-Id: I69a5cb4c7e93ff1c944431115284ecd840725416 --- M templates/pig/pig.properties.erb 1 file changed, 1 insertion(+), 1 deletion(-) git pull ssh://gerrit.wikimedia.org:29418/operations/puppet/cdh4 refs/changes/23/69523/1 diff --git a/templates/pig/pig.properties.erb b/templates/pig/pig.properties.erb index ea1ff44..004d2e6 100644 --- a/templates/pig/pig.properties.erb +++ b/templates/pig/pig.properties.erb @@ -27,7 +27,7 @@ # hod realted properties #ssh.gateway #hod.expect.root -#hod.expect.useensure => installed +#hod.expect.uselatest #hod.command #hod.config.dir #hod.param -- To view, visit https://gerrit.wikimedia.org/r/69523 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I69a5cb4c7e93ff1c944431115284ecd840725416 Gerrit-PatchSet: 1 Gerrit-Project: operations/puppet/cdh4 Gerrit-Branch: master Gerrit-Owner: Ottomata ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
[MediaWiki-commits] [Gerrit] Changing metrics.wikimedia.org htpasswd - change (operations/puppet)
Ottomata has uploaded a new change for review. https://gerrit.wikimedia.org/r/64942 Change subject: Changing metrics.wikimedia.org htpasswd .. Changing metrics.wikimedia.org htpasswd Change-Id: I3a785546d8d5ed1c46df21ab4c6ae62ce908d27b --- M manifests/misc/statistics.pp 1 file changed, 1 insertion(+), 1 deletion(-) git pull ssh://gerrit.wikimedia.org:29418/operations/puppet refs/changes/42/64942/1 diff --git a/manifests/misc/statistics.pp b/manifests/misc/statistics.pp index ca7baaa..423e6d6 100644 --- a/manifests/misc/statistics.pp +++ b/manifests/misc/statistics.pp @@ -383,7 +383,7 @@ # install a .htpasswd file for E3 file { "$e3_home/.htpasswd": - content => 'e3:$apr1$krR9Lhez$Yr0Ya9GpCW8KRQLeyR5Rn.', + content => $passwords::e3::metrics::htpasswd_content, owner=> $metrics_user, group=> "wikidev", mode => 0664, -- To view, visit https://gerrit.wikimedia.org/r/64942 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I3a785546d8d5ed1c46df21ab4c6ae62ce908d27b Gerrit-PatchSet: 1 Gerrit-Project: operations/puppet Gerrit-Branch: production Gerrit-Owner: Ottomata ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
[MediaWiki-commits] [Gerrit] Changing metrics.wikimedia.org htpasswd - change (operations/puppet)
Ottomata has submitted this change and it was merged. Change subject: Changing metrics.wikimedia.org htpasswd .. Changing metrics.wikimedia.org htpasswd Change-Id: I3a785546d8d5ed1c46df21ab4c6ae62ce908d27b --- M manifests/misc/statistics.pp 1 file changed, 1 insertion(+), 1 deletion(-) Approvals: Ottomata: Verified; Looks good to me, approved jenkins-bot: Verified diff --git a/manifests/misc/statistics.pp b/manifests/misc/statistics.pp index ca7baaa..423e6d6 100644 --- a/manifests/misc/statistics.pp +++ b/manifests/misc/statistics.pp @@ -383,7 +383,7 @@ # install a .htpasswd file for E3 file { "$e3_home/.htpasswd": - content => 'e3:$apr1$krR9Lhez$Yr0Ya9GpCW8KRQLeyR5Rn.', + content => $passwords::e3::metrics::htpasswd_content, owner=> $metrics_user, group=> "wikidev", mode => 0664, -- To view, visit https://gerrit.wikimedia.org/r/64942 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: merged Gerrit-Change-Id: I3a785546d8d5ed1c46df21ab4c6ae62ce908d27b Gerrit-PatchSet: 1 Gerrit-Project: operations/puppet Gerrit-Branch: production Gerrit-Owner: Ottomata Gerrit-Reviewer: Ottomata Gerrit-Reviewer: jenkins-bot ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
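The merged diff replaces a hard-coded htpasswd hash with a lookup of $passwords::e3::metrics::htpasswd_content. That variable implies a passwords class along the following lines; this is only a guess at its shape, since such classes are typically kept out of the public repository and the real one is not part of this change:

```puppet
# Hypothetical sketch of the referenced passwords class; the real
# definition lives elsewhere and carries the actual hash.
class passwords::e3::metrics {
    $htpasswd_content = 'e3:$apr1$REPLACE_WITH_REAL_HASH'
}

# The manifest reading the fully qualified variable must also evaluate
# the class, e.g. via: include passwords::e3::metrics
```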
[MediaWiki-commits] [Gerrit] Adding ganglia view for Kafka stats - change (operations/puppet)
Ottomata has uploaded a new change for review. https://gerrit.wikimedia.org/r/65122 Change subject: Adding ganglia view for Kafka stats .. Adding ganglia view for Kafka stats Change-Id: I7c59f9c110eb659db63992e532172ebb998969b2 --- M manifests/misc/monitoring.pp 1 file changed, 51 insertions(+), 1 deletion(-) git pull ssh://gerrit.wikimedia.org:29418/operations/puppet refs/changes/22/65122/1 diff --git a/manifests/misc/monitoring.pp b/manifests/misc/monitoring.pp index d18fdcf..b8a9502 100644 --- a/manifests/misc/monitoring.pp +++ b/manifests/misc/monitoring.pp @@ -50,6 +50,10 @@ misc::monitoring::view::udp2log { 'udp2log': host_regex => 'locke|emery|oxygen|gadolinium', } + misc::monitoring::view::kafka { 'kafka': + kafka_broker_host_regexn => 'analytics10[12].eqiad.wmnet', + kafka_producer_host_regex => 'analytics100[345689].eqiad.wmnet', + } } # == Define misc:monitoring::view::udp2log @@ -60,7 +64,6 @@ # # == Parameters: # $host_regex - regex to pass to ganglia::view for matching host names in the view. -# $conf_dir # define misc::monitoring::view::udp2log($host_regex) { ganglia::view { $name: @@ -114,4 +117,51 @@ }, ], } +} + + +# == Define misc:monitoring::view::kafka +# Installs a ganglia::view for a group of nodes +# running kafka broker servers. This is just a wrapper for +# kafka specific metrics to include in kafka +# +# == Parameters: +# $kafka_broker_host_regex - regex matching kafka broker hosts +# kafka_producer_host_regex - regex matching kafka producer hosts +# +define misc::monitoring::view::kafka($kafka_broker_host_regex, $kafka_producer_host_regex) { + ganglia::view { $name: + graphs => [ + { + 'host_regex' => $kafka_broker_host_regex, + 'metric_regex' => 'kafka_network_SocketServerStats.ProduceRequestsPerSecond', + 'type' => 'stack', + }, + { + 'host_regex' => $kafka_broker_host_regex, + 'metric_regex' => 'kafka_network_SocketServerStats.FetchRequestsPerSecond', + 'type' => 'stack', + }, + { + 'host_regex' => $kafka_broker_host_regex, + 'metric_regex' => 'kafka_network_SocketServerStats.BytesWrittenPerSecond', + 'type' => 'stack', + }, + { + 'host_regex' => $kafka_broker_host_regex, + 'metric_regex' => 'kafka_network_SocketServerStats.BytesReadPerSecond', + 'type' => 'stack', + }, + { + 'host_regex' => $kafka_broker_host_regex, + 'metric_regex' => 'kafka_message_LogFlushStats.FlushesPerSecond', + 'type' => 'stack', + }, + { + 'host_regex' => $kafka_producer_host_regex, + 'metric_regex' => 'kafka_producer_KafkaProducerStats-.+.ProduceRequestsPerSecond', + 'type' => 'stack', + }, + ], + } } \ No newline at end of file -- To view, visit https://gerrit.wikimedia.org/r/65122 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I7c59f9c110eb659db63992e532172ebb998969b2 Gerrit-PatchSet: 1 Gerrit-Project: operations/puppet Gerrit-Branch: production Gerrit-Owner: Ottomata ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
[MediaWiki-commits] [Gerrit] Adding ganglia view for Kafka stats - change (operations/puppet)
Ottomata has submitted this change and it was merged. Change subject: Adding ganglia view for Kafka stats .. Adding ganglia view for Kafka stats Change-Id: I7c59f9c110eb659db63992e532172ebb998969b2 --- M manifests/misc/monitoring.pp 1 file changed, 51 insertions(+), 1 deletion(-) Approvals: Ottomata: Verified; Looks good to me, approved jenkins-bot: Verified diff --git a/manifests/misc/monitoring.pp b/manifests/misc/monitoring.pp index d18fdcf..b8a9502 100644 --- a/manifests/misc/monitoring.pp +++ b/manifests/misc/monitoring.pp @@ -50,6 +50,10 @@ misc::monitoring::view::udp2log { 'udp2log': host_regex => 'locke|emery|oxygen|gadolinium', } + misc::monitoring::view::kafka { 'kafka': + kafka_broker_host_regexn => 'analytics10[12].eqiad.wmnet', + kafka_producer_host_regex => 'analytics100[345689].eqiad.wmnet', + } } # == Define misc:monitoring::view::udp2log @@ -60,7 +64,6 @@ # # == Parameters: # $host_regex - regex to pass to ganglia::view for matching host names in the view. -# $conf_dir # define misc::monitoring::view::udp2log($host_regex) { ganglia::view { $name: @@ -114,4 +117,51 @@ }, ], } +} + + +# == Define misc:monitoring::view::kafka +# Installs a ganglia::view for a group of nodes +# running kafka broker servers. This is just a wrapper for +# kafka specific metrics to include in kafka +# +# == Parameters: +# $kafka_broker_host_regex - regex matching kafka broker hosts +# kafka_producer_host_regex - regex matching kafka producer hosts +# +define misc::monitoring::view::kafka($kafka_broker_host_regex, $kafka_producer_host_regex) { + ganglia::view { $name: + graphs => [ + { + 'host_regex' => $kafka_broker_host_regex, + 'metric_regex' => 'kafka_network_SocketServerStats.ProduceRequestsPerSecond', + 'type' => 'stack', + }, + { + 'host_regex' => $kafka_broker_host_regex, + 'metric_regex' => 'kafka_network_SocketServerStats.FetchRequestsPerSecond', + 'type' => 'stack', + }, + { + 'host_regex' => $kafka_broker_host_regex, + 'metric_regex' => 'kafka_network_SocketServerStats.BytesWrittenPerSecond', + 'type' => 'stack', + }, + { + 'host_regex' => $kafka_broker_host_regex, + 'metric_regex' => 'kafka_network_SocketServerStats.BytesReadPerSecond', + 'type' => 'stack', + }, + { + 'host_regex' => $kafka_broker_host_regex, + 'metric_regex' => 'kafka_message_LogFlushStats.FlushesPerSecond', + 'type' => 'stack', + }, + { + 'host_regex' => $kafka_producer_host_regex, + 'metric_regex' => 'kafka_producer_KafkaProducerStats-.+.ProduceRequestsPerSecond', + 'type' => 'stack', + }, + ], + } } \ No newline at end of file -- To view, visit https://gerrit.wikimedia.org/r/65122 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: merged Gerrit-Change-Id: I7c59f9c110eb659db63992e532172ebb998969b2 Gerrit-PatchSet: 1 Gerrit-Project: operations/puppet Gerrit-Branch: production Gerrit-Owner: Ottomata Gerrit-Reviewer: Ottomata Gerrit-Reviewer: jenkins-bot ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
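misc::monitoring::view::kafka is a thin wrapper: it just hands a graphs array of host_regex/metric_regex/type hashes to ganglia::view. The same structure can be used directly for a one-off view; a sketch with an illustrative host regex and one of the broker metrics from the wrapper above:

```puppet
# Hypothetical one-off view reusing the graphs structure that
# misc::monitoring::view::kafka passes through to ganglia::view.
ganglia::view { 'kafka_bytes_out':
    graphs => [
        {
            'host_regex'   => 'analytics102[12].eqiad.wmnet',
            'metric_regex' => 'kafka_network_SocketServerStats.BytesWrittenPerSecond',
            'type'         => 'stack',
        },
    ],
}
```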
[MediaWiki-commits] [Gerrit] Fixing typo - change (operations/puppet)
Ottomata has uploaded a new change for review. https://gerrit.wikimedia.org/r/65124 Change subject: Fixing typo .. Fixing typo Change-Id: I2247beb88018558bf255e69296168bc58a005586 --- M manifests/misc/monitoring.pp 1 file changed, 1 insertion(+), 1 deletion(-) git pull ssh://gerrit.wikimedia.org:29418/operations/puppet refs/changes/24/65124/1 diff --git a/manifests/misc/monitoring.pp b/manifests/misc/monitoring.pp index b8a9502..261a33b 100644 --- a/manifests/misc/monitoring.pp +++ b/manifests/misc/monitoring.pp @@ -51,7 +51,7 @@ host_regex => 'locke|emery|oxygen|gadolinium', } misc::monitoring::view::kafka { 'kafka': - kafka_broker_host_regexn => 'analytics10[12].eqiad.wmnet', + kafka_broker_host_regex => 'analytics10[12].eqiad.wmnet', kafka_producer_host_regex => 'analytics100[345689].eqiad.wmnet', } } -- To view, visit https://gerrit.wikimedia.org/r/65124 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I2247beb88018558bf255e69296168bc58a005586 Gerrit-PatchSet: 1 Gerrit-Project: operations/puppet Gerrit-Branch: production Gerrit-Owner: Ottomata ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
[MediaWiki-commits] [Gerrit] Fixing typo - change (operations/puppet)
Ottomata has submitted this change and it was merged. Change subject: Fixing typo .. Fixing typo Change-Id: I2247beb88018558bf255e69296168bc58a005586 --- M manifests/misc/monitoring.pp 1 file changed, 1 insertion(+), 1 deletion(-) Approvals: Ottomata: Verified; Looks good to me, approved diff --git a/manifests/misc/monitoring.pp b/manifests/misc/monitoring.pp index b8a9502..261a33b 100644 --- a/manifests/misc/monitoring.pp +++ b/manifests/misc/monitoring.pp @@ -51,7 +51,7 @@ host_regex => 'locke|emery|oxygen|gadolinium', } misc::monitoring::view::kafka { 'kafka': - kafka_broker_host_regexn => 'analytics10[12].eqiad.wmnet', + kafka_broker_host_regex => 'analytics10[12].eqiad.wmnet', kafka_producer_host_regex => 'analytics100[345689].eqiad.wmnet', } } -- To view, visit https://gerrit.wikimedia.org/r/65124 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: merged Gerrit-Change-Id: I2247beb88018558bf255e69296168bc58a005586 Gerrit-PatchSet: 1 Gerrit-Project: operations/puppet Gerrit-Branch: production Gerrit-Owner: Ottomata Gerrit-Reviewer: Ottomata ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
[MediaWiki-commits] [Gerrit] Fixing confdir for ganglia::view - change (operations/puppet)
Ottomata has uploaded a new change for review. https://gerrit.wikimedia.org/r/65131 Change subject: Fixing confdir for ganglia::view .. Fixing confdir for ganglia::view Change-Id: Ibad89b26442541162482eda567346c3185114ded --- M manifests/ganglia.pp 1 file changed, 1 insertion(+), 1 deletion(-) git pull ssh://gerrit.wikimedia.org:29418/operations/puppet refs/changes/31/65131/1 diff --git a/manifests/ganglia.pp b/manifests/ganglia.pp index f3e7d29..3b39e5c 100644 --- a/manifests/ganglia.pp +++ b/manifests/ganglia.pp @@ -614,7 +614,7 @@ $items= [], $view_type= 'standard', $default_size = 'large', - $conf_dir = $ganglia::web::ganglia_confdir, + $conf_dir = "${ganglia::web::ganglia_confdir}/conf", $template = 'ganglia/ganglia_view.json.erb') { require ganglia::web -- To view, visit https://gerrit.wikimedia.org/r/65131 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: newchange Gerrit-Change-Id: Ibad89b26442541162482eda567346c3185114ded Gerrit-PatchSet: 1 Gerrit-Project: operations/puppet Gerrit-Branch: production Gerrit-Owner: Ottomata ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
[MediaWiki-commits] [Gerrit] Fixing confdir for ganglia::view - change (operations/puppet)
Ottomata has submitted this change and it was merged. Change subject: Fixing confdir for ganglia::view .. Fixing confdir for ganglia::view Change-Id: Ibad89b26442541162482eda567346c3185114ded --- M manifests/ganglia.pp 1 file changed, 1 insertion(+), 1 deletion(-) Approvals: Ottomata: Verified; Looks good to me, approved diff --git a/manifests/ganglia.pp b/manifests/ganglia.pp index f3e7d29..3b39e5c 100644 --- a/manifests/ganglia.pp +++ b/manifests/ganglia.pp @@ -614,7 +614,7 @@ $items= [], $view_type= 'standard', $default_size = 'large', - $conf_dir = $ganglia::web::ganglia_confdir, + $conf_dir = "${ganglia::web::ganglia_confdir}/conf", $template = 'ganglia/ganglia_view.json.erb') { require ganglia::web -- To view, visit https://gerrit.wikimedia.org/r/65131 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: merged Gerrit-Change-Id: Ibad89b26442541162482eda567346c3185114ded Gerrit-PatchSet: 1 Gerrit-Project: operations/puppet Gerrit-Branch: production Gerrit-Owner: Ottomata Gerrit-Reviewer: Ottomata ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
[MediaWiki-commits] [Gerrit] Adding git submodule operations/pupppet/cdh4 at modules/cdh4 - change (operations/puppet)
Ottomata has uploaded a new change for review. https://gerrit.wikimedia.org/r/65247 Change subject: Adding git submodule operations/pupppet/cdh4 at modules/cdh4 .. Adding git submodule operations/pupppet/cdh4 at modules/cdh4 Change-Id: I874d6deb65b07fd7d7ed3407d40dd81e77711533 --- A .gitmodules A modules/cdh4 2 files changed, 3 insertions(+), 0 deletions(-) git pull ssh://gerrit.wikimedia.org:29418/operations/puppet refs/changes/47/65247/1 diff --git a/.gitmodules b/.gitmodules new file mode 100644 index 000..48bb333 --- /dev/null +++ b/.gitmodules @@ -0,0 +1,3 @@ +[submodule "modules/cdh4"] + path = modules/cdh4 + url = https://gerrit.wikimedia.org/r/operations/puppet/cdh4 diff --git a/modules/cdh4 b/modules/cdh4 new file mode 16 index 000..79452a3 --- /dev/null +++ b/modules/cdh4 +Subproject commit 79452a332dc23b8400686f96bb45a18c18241796 -- To view, visit https://gerrit.wikimedia.org/r/65247 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I874d6deb65b07fd7d7ed3407d40dd81e77711533 Gerrit-PatchSet: 1 Gerrit-Project: operations/puppet Gerrit-Branch: production Gerrit-Owner: Ottomata ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
[MediaWiki-commits] [Gerrit] Adding git submodule operations/pupppet/cdh4 at modules/cdh4 - change (operations/puppet)
Ottomata has submitted this change and it was merged. Change subject: Adding git submodule operations/pupppet/cdh4 at modules/cdh4 .. Adding git submodule operations/pupppet/cdh4 at modules/cdh4 Change-Id: I874d6deb65b07fd7d7ed3407d40dd81e77711533 --- A .gitmodules A modules/cdh4 2 files changed, 3 insertions(+), 0 deletions(-) Approvals: Ottomata: Verified; Looks good to me, approved jenkins-bot: Verified diff --git a/.gitmodules b/.gitmodules new file mode 100644 index 000..48bb333 --- /dev/null +++ b/.gitmodules @@ -0,0 +1,3 @@ +[submodule "modules/cdh4"] + path = modules/cdh4 + url = https://gerrit.wikimedia.org/r/operations/puppet/cdh4 diff --git a/modules/cdh4 b/modules/cdh4 new file mode 16 index 000..79452a3 --- /dev/null +++ b/modules/cdh4 +Subproject commit 79452a332dc23b8400686f96bb45a18c18241796 -- To view, visit https://gerrit.wikimedia.org/r/65247 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: merged Gerrit-Change-Id: I874d6deb65b07fd7d7ed3407d40dd81e77711533 Gerrit-PatchSet: 1 Gerrit-Project: operations/puppet Gerrit-Branch: production Gerrit-Owner: Ottomata Gerrit-Reviewer: Ottomata Gerrit-Reviewer: jenkins-bot ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
[MediaWiki-commits] [Gerrit] Adding classes to install hive, pig and sqoop. (very simple!) - change (operations...cdh4)
Ottomata has uploaded a new change for review. https://gerrit.wikimedia.org/r/65267 Change subject: Adding classes to install hive, pig and sqoop. (very simple!) .. Adding classes to install hive, pig and sqoop. (very simple!) Hive Server and Hive Metastore puppetiztaion will come in a separate commit. Change-Id: Iaa7769ce39e7002edb2b2476df3f850c611b4e6b --- M TODO.md A manifests/hive.pp A manifests/pig.pp A manifests/sqoop.pp A templates/pig/pig.properties.erb 5 files changed, 98 insertions(+), 1 deletion(-) git pull ssh://gerrit.wikimedia.org:29418/operations/puppet/cdh4 refs/changes/67/65267/1 diff --git a/TODO.md b/TODO.md index 2fb0842..9dac694 100644 --- a/TODO.md +++ b/TODO.md @@ -11,4 +11,16 @@ - Make log4j.properties more configurable. - Support Secondary NameNode. - Support High Availability NameNode. -- Make JMX ports configurable. \ No newline at end of file +- Make JMX ports configurable. + +## Hive +- Hive Server + Hive Metastore + +## Oozie + +## Hue + +## HBase + +## Zookeeper + diff --git a/manifests/hive.pp b/manifests/hive.pp new file mode 100644 index 000..5bac968 --- /dev/null +++ b/manifests/hive.pp @@ -0,0 +1,10 @@ +# == Class cdh4::hive +# +# Installs Hive packages (needed for Hive Client). +# Use cdh4::hive::server to install and set up a Hive server. +# +class cdh4::hive { + package { 'hive': +ensure => 'installed', + } +} \ No newline at end of file diff --git a/manifests/pig.pp b/manifests/pig.pp new file mode 100644 index 000..d9689a6 --- /dev/null +++ b/manifests/pig.pp @@ -0,0 +1,14 @@ +# == Class cdh4::pig +# +# Installs and configures Apache Pig. +# +class cdh4::pig { + package { 'pig': +ensure => 'installed', + } + + file { '/etc/pig/pig.properties': +content => template('cdh4/pig/pig.properties.erb'), +require => Package['pig'], + } +} diff --git a/manifests/sqoop.pp b/manifests/sqoop.pp new file mode 100644 index 000..f6c1699 --- /dev/null +++ b/manifests/sqoop.pp @@ -0,0 +1,7 @@ +# == Class cdh4::sqoop +# Installs Sqoop +class cdh4::sqoop { + package { 'sqoop': +ensure => 'installed', + } +} \ No newline at end of file diff --git a/templates/pig/pig.properties.erb b/templates/pig/pig.properties.erb new file mode 100644 index 000..ea1ff44 --- /dev/null +++ b/templates/pig/pig.properties.erb @@ -0,0 +1,54 @@ +# Pig configuration file. All values can be overwritten by command line arguments. +# see bin/pig -help + +# log4jconf log4j configuration file +# log4jconf=./conf/log4j.properties + +# brief logging (no timestamps) +brief=false + +# clustername, name of the hadoop jobtracker. If no port is defined port 50020 will be used. +#cluster + +#debug level, INFO is default +debug=INFO + +# a file that contains pig script +#file= + +# load jarfile, colon separated +#jar= + +#verbose print all log messages to screen (default to print only INFO and above to screen) +verbose=false + +#exectype local|mapreduce, mapreduce is default +#exectype=mapreduce +# hod realted properties +#ssh.gateway +#hod.expect.root +#hod.expect.useensure => installed +#hod.command +#hod.config.dir +#hod.param + + +#Do not spill temp files smaller than this size (bytes) +pig.spill.size.threshold=500 +#EXPERIMENT: Activate garbage collection when spilling a file bigger than this size (bytes) +#This should help reduce the number of files being spilled. +pig.spill.gc.activation.size=4000 + + +## +# Everything below this line is Yahoo specific. Note that I've made +# (almost) no changes to the lines above to make merging in from Apache +# easier. 
Any values I don't want from above I override below. +# +# This file is configured for use with HOD on the production clusters. If you +# want to run pig with a static cluster you will need to remove everything +# below this line and set the cluster value (above) to the +# hostname and port of your job tracker. + +exectype=mapreduce +log.file= -- To view, visit https://gerrit.wikimedia.org/r/65267 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: newchange Gerrit-Change-Id: Iaa7769ce39e7002edb2b2476df3f850c611b4e6b Gerrit-PatchSet: 1 Gerrit-Project: operations/puppet/cdh4 Gerrit-Branch: master Gerrit-Owner: Ottomata ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
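Taken together, these three new classes provide only the client-side tooling (Hive CLI, Pig, Sqoop). A minimal sketch of how a node might pull them in; the node name below is a placeholder, and the cdh4::hadoop client configuration (pointing at the cluster's NameNode) is assumed to be declared elsewhere, for example in a role class:

    node 'hadoop-client.example.org' {
        # Assumed: cdh4::hadoop (client configs with namenode_hostname) is
        # declared elsewhere, e.g. via a role class.
        include cdh4::hive    # Hive client packages only
        include cdh4::pig     # Pig plus the pig.properties template
        include cdh4::sqoop   # Sqoop
    }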
[MediaWiki-commits] [Gerrit] Fixing kafka_broker_host_regex for ganglia view - change (operations/puppet)
Ottomata has uploaded a new change for review. https://gerrit.wikimedia.org/r/65277 Change subject: Fixing kafka_broker_host_regex for ganglia view .. Fixing kafka_broker_host_regex for ganglia view Change-Id: Id81471bbb32f297d22e4c0ce2ed892cbb5edba56 --- M manifests/misc/monitoring.pp 1 file changed, 1 insertion(+), 1 deletion(-) git pull ssh://gerrit.wikimedia.org:29418/operations/puppet refs/changes/77/65277/1 diff --git a/manifests/misc/monitoring.pp b/manifests/misc/monitoring.pp index 261a33b..16e3953 100644 --- a/manifests/misc/monitoring.pp +++ b/manifests/misc/monitoring.pp @@ -51,7 +51,7 @@ host_regex => 'locke|emery|oxygen|gadolinium', } misc::monitoring::view::kafka { 'kafka': - kafka_broker_host_regex => 'analytics10[12].eqiad.wmnet', + kafka_broker_host_regex => 'analytics102[12].eqiad.wmnet', kafka_producer_host_regex => 'analytics100[345689].eqiad.wmnet', } } -- To view, visit https://gerrit.wikimedia.org/r/65277 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: newchange Gerrit-Change-Id: Id81471bbb32f297d22e4c0ce2ed892cbb5edba56 Gerrit-PatchSet: 1 Gerrit-Project: operations/puppet Gerrit-Branch: production Gerrit-Owner: Ottomata ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
[MediaWiki-commits] [Gerrit] Fixing kafka_broker_host_regex for ganglia view - change (operations/puppet)
Ottomata has submitted this change and it was merged. Change subject: Fixing kafka_broker_host_regex for ganglia view .. Fixing kafka_broker_host_regex for ganglia view Change-Id: Id81471bbb32f297d22e4c0ce2ed892cbb5edba56 --- M manifests/misc/monitoring.pp 1 file changed, 1 insertion(+), 1 deletion(-) Approvals: Ottomata: Verified; Looks good to me, approved diff --git a/manifests/misc/monitoring.pp b/manifests/misc/monitoring.pp index 261a33b..16e3953 100644 --- a/manifests/misc/monitoring.pp +++ b/manifests/misc/monitoring.pp @@ -51,7 +51,7 @@ host_regex => 'locke|emery|oxygen|gadolinium', } misc::monitoring::view::kafka { 'kafka': - kafka_broker_host_regex => 'analytics10[12].eqiad.wmnet', + kafka_broker_host_regex => 'analytics102[12].eqiad.wmnet', kafka_producer_host_regex => 'analytics100[345689].eqiad.wmnet', } } -- To view, visit https://gerrit.wikimedia.org/r/65277 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: merged Gerrit-Change-Id: Id81471bbb32f297d22e4c0ce2ed892cbb5edba56 Gerrit-PatchSet: 1 Gerrit-Project: operations/puppet Gerrit-Branch: production Gerrit-Owner: Ottomata Gerrit-Reviewer: Ottomata ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
[MediaWiki-commits] [Gerrit] Adding CDH4 Zookeeper puppetization - change (operations...cdh4)
Ottomata has uploaded a new change for review. https://gerrit.wikimedia.org/r/65408 Change subject: Adding CDH4 Zookeeper puppetization .. Adding CDH4 Zookeeper puppetization Change-Id: Ifc5d1675b96a519761c6189e06408ebe20d4a733 --- A manifests/zookeeper.pp A manifests/zookeeper/defaults.pp A manifests/zookeeper/server.pp A templates/zookeeper/log4j.properties.erb A templates/zookeeper/zoo.cfg.erb A templates/zookeeper/zookeeper-env.sh.erb 6 files changed, 190 insertions(+), 0 deletions(-) git pull ssh://gerrit.wikimedia.org:29418/operations/puppet/cdh4 refs/changes/08/65408/1 diff --git a/manifests/zookeeper.pp b/manifests/zookeeper.pp new file mode 100644 index 000..4efdf6f --- /dev/null +++ b/manifests/zookeeper.pp @@ -0,0 +1,46 @@ +# == Class cdh4::zookeeper +# Installs common zookeeper package and configs. +# +# == Parameters +# $hosts - Array of zookeeper fqdns. The order of this array matters. +# Zookeeper 'myid's will be inferred from the node's index in this array. +# $data_dir - Zookeeper dataDir. +# +# == Usage +# class { 'cdh4::zookeeper': +# hosts=> ['zoo0.domain.org', 'zoo1.domain.org', 'zoo2.domain.org'], +# data_dir => '/var/lib/zookeeper', +# } +# +# The above setup should be used to configure a 3 node zookeeper cluster. +# You can include the above class on any of your nodes that will need to talk +# to the zookeeper cluster. +# +# On the 3 zookeeper server nodes, you should also include: +# +# class { 'cdh4::zookeeper::server': } +# +# This will ensure that the zookeeper server is running. +# Remember that this requires that you also include the +# cdh4::zookeeper class as defined above as well as the server class. +# +# On each of the defined zookeeper hosts, a myid file must be created +# that identifies the host in the zookeeper quorum. This myid number +# will be inferred from the nodes index in the zookeeper hosts array. +# e.g. zoo0.domain.org's myid will be '0', zoo1.domain.org's myid will be 1, etc. +# +class cdh4::zookeeper( + $hosts = $::cdh4::zookeeper::defaults::hosts, + $data_dir = $::cdh4::zookeeper::defaults::data_dir, + $conf_template = $::cdh4::zookeeper::defaults::conf_template, +) inherits cdh4::zookeeper::defaults +{ + package { 'zookeeper': +ensure => 'installed', + } + + file { '/etc/zookeeper/conf/zoo.cfg': +content => template($conf_template), +require => Package['zookeeper'], + } +} \ No newline at end of file diff --git a/manifests/zookeeper/defaults.pp b/manifests/zookeeper/defaults.pp new file mode 100644 index 000..bba24c4 --- /dev/null +++ b/manifests/zookeeper/defaults.pp @@ -0,0 +1,21 @@ +# == Class cdh4::zookeeper::defaults +# Default zookeeper configs. +class cdh4::zookeeper::defaults { + # NOTE: The order of this array matters. + # The Zookeeper 'myid' will be infered by the index of a node's + # fqdn in this array. Changing the order will change the 'myid' + # of zookeeper servers. + $hosts = [$::fqdn] + + $data_dir = '/tmp/zookeeper' + $log_file = '/var/log/zookeeper/zookeeper.log' + $jmx_port = 9998 + + # Default puppet paths to template config files. + # This allows us to use custom template config files + # if we want to override more settings than this + # module yet supports. 
+ $conf_template = 'cdh4/zookeeper/zoo.cfg.erb' + $env_template= 'cdh4/zookeeper/zookeeper-env.sh.erb' + $log4j_template = 'cdh4/zookeeper/log4j.properties.erb' +} \ No newline at end of file diff --git a/manifests/zookeeper/server.pp b/manifests/zookeeper/server.pp new file mode 100644 index 000..1dd0381 --- /dev/null +++ b/manifests/zookeeper/server.pp @@ -0,0 +1,64 @@ +# == Class cdh4::zookeeper::server +# Configures a zookeeper server. +# This requires that cdh4::zookeeper is installed +# And that the current nodes fqdn is an entry in the +# cdh4::zookeeper::hosts array. +# +# == Parameters +# $jmx_port - JMX port. Set this to false if you don't want to expose JMX. +# $log_file - zookeeper.log file. Default: /var/log/zookeeper/zookeeper.log +# +class cdh4::zookeeper::server( + $jmx_port = $::cdh4::zookeeper::defaults::jmx_port, + $log_file = $::cdh4::zookeeper::defaults::log_file, + $env_template = $::cdh4::zookeeper::defaults::env_template, + $log4j_template = $::cdh4::zookeeper::defaults::log4j_template, +) +{ + # need zookeeper common package and config. + Class['cdh4::zookeeper'] -> Class['cdh4::zookeeper::server'] + + package { 'zookeeper-server': +ensure => 'installed', + } + + file { $::cdh4::zookeeper::data_dir: +ensure => 'directory', +owner => 'zookeeper', +group => 'zookeeper', +mode => '0755', + } + + file { '/etc/zookeeper/conf/zookeeper-env.sh': +content => template($env_template), +require => Package['zookeeper-server'], + } + + file { '/etc/zookeeper/conf/log4j.propert
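The class documentation above states that each server's zookeeper 'myid' is inferred from the node's index in the $hosts array. As an illustration of that idea only (not the module's actual code, which is cut off above), the derivation inside cdh4::zookeeper::server could look roughly like this:

    # Illustration only: derive this node's myid from its position in the
    # configured hosts array and write it into the dataDir.
    # ($hosts and $data_dir here stand for the cdh4::zookeeper parameters.)
    $myid = inline_template('<%= @hosts.index(@fqdn) %>')
    file { "${data_dir}/myid":
        content => "${myid}\n",
        owner   => 'zookeeper',
        group   => 'zookeeper',
        require => File[$data_dir],
    }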
[MediaWiki-commits] [Gerrit] test commit, do not merge. - change (operations/puppet)
Ottomata has uploaded a new change for review. https://gerrit.wikimedia.org/r/65470 Change subject: test commit, do not merge. .. test commit, do not merge. Change-Id: I46e0f345eca2de4446c7baea7b4250c2446d3108 --- M typos 1 file changed, 1 insertion(+), 0 deletions(-) git pull ssh://gerrit.wikimedia.org:29418/operations/puppet refs/changes/70/65470/1 diff --git a/typos b/typos index e866550..1656ab4 100644 --- a/typos +++ b/typos @@ -1,3 +1,4 @@ pmpta gadolinum wikmedia + -- To view, visit https://gerrit.wikimedia.org/r/65470 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I46e0f345eca2de4446c7baea7b4250c2446d3108 Gerrit-PatchSet: 1 Gerrit-Project: operations/puppet Gerrit-Branch: production Gerrit-Owner: Ottomata ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
[MediaWiki-commits] [Gerrit] Getting oxygen ready for precise upgrade - change (operations/puppet)
Ottomata has uploaded a new change for review. https://gerrit.wikimedia.org/r/65586 Change subject: Getting oxygen ready for precise upgrade .. Getting oxygen ready for precise upgrade Change-Id: Ifa81308e0f6e893e4c25483322ab15992f8e8a0a --- M manifests/misc/statistics.pp M manifests/role/logging.pp M manifests/site.pp M templates/udp2log/filters.oxygen.erb 4 files changed, 52 insertions(+), 64 deletions(-) git pull ssh://gerrit.wikimedia.org:29418/operations/puppet refs/changes/86/65586/1 diff --git a/manifests/misc/statistics.pp b/manifests/misc/statistics.pp index 049b0e9..3ba61b0 100644 --- a/manifests/misc/statistics.pp +++ b/manifests/misc/statistics.pp @@ -625,7 +625,7 @@ # wikipedia zero logs from oxygen misc::statistics::rsync_job { "wikipedia_zero": - source => "oxygen.wikimedia.org::udp2log/archive/zero-*.gz", + source => "oxygen.wikimedia.org::udp2log/webrequest/archive/zero-*.gz", destination => "/a/squid/archive/zero", } diff --git a/manifests/role/logging.pp b/manifests/role/logging.pp index a057d2c..5192dfb 100644 --- a/manifests/role/logging.pp +++ b/manifests/role/logging.pp @@ -266,6 +266,31 @@ } } +# oxygen is a generic webrequests udp2log host +# mostly running wikipedia zero filters. +class role::logging::udp2log::oxygen inherits role::logging::udp2log { + # udp2log::instance will ensure this is created + $webrequest_log_directory= "$log_directory/webrequest" + + misc::udp2log::instance { 'oxygen': + multicast => true, + log_directory => $webrequest_log_directory, + } +} + +# lucene udp2log instance for capturing search logs +class role::logging::udp2log::lucene inherits role::logging::udp2log { + # udp2log::instance will ensure this is created + $lucene_log_directory= "$log_directory/lucene" + + misc::udp2log::instance { 'lucene': + port => '51234', + log_directory=> $lucene_log_directory, + monitor_packet_loss => false, + } +} + + # EventLogging collector class role::logging::eventlogging { system_role { "misc::log-collector": diff --git a/manifests/site.pp b/manifests/site.pp index 28b8823..4556699 100644 --- a/manifests/site.pp +++ b/manifests/site.pp @@ -2109,47 +2109,10 @@ # RT 4312 accounts::milimetric - include - misc::udp2log - - sudo_user { "otto": privileges => ['ALL = NOPASSWD: ALL'] } - - # oxygen's udp2log instance - # saves logs mainly in /a/squid. - misc::udp2log::instance { "oxygen": - multicast => true, - # TODO: Move this to /a/log/webrequest - log_directory => "/a/squid", - # oxygen's packet-loss.log file is alredy in /var/log/udp2log - packet_loss_log => "/var/log/udp2log/packet-loss.log", - } - - # Set up an rsync daemon module for udp2log logrotated - # archives. This allows stat1 to copy logs from the - # logrotated archive directory - class { "misc::udp2log::rsyncd": - path=> "/a/squid", - require => Misc::Udp2log::Instance["oxygen"], - } - - # udp2log-lucene instance for - # lucene search logs. Don't need - # to monitor packet loss here. 
- misc::udp2log::instance { "lucene": - port=> "51234", - log_directory => "/a/log/lucene", - monitor_packet_loss => false, - } - - # rsync archived lucene logs over to dataset2 - # These are available for download at http://dumps.wikimedia.org/other/search/ - cron { "search_logs_rsync": - command => "rsync -r /a/log/lucene/archive/lucene.log*.gz dataset2::search-logs/", - hour=> '8', - minute => '0', - user=> 'backup', - ensure => absent, - } + # main oxygen udp2log handles mostly Wikipedia Zero webrequest logs + include role::logging::udp2log::oxygen + # Also include lucene search loggging udp2log instance + include role::logging::udp2log::lucene } node /^payments[1-4]\.wikimedia\.org$/ { diff --git a/templates/udp2log/filters.oxygen.erb b/templates/udp2log/filters.oxygen.erb index a96046f..7050bdf 100644 --- a/templates/udp2log/filters.oxygen.erb +++ b/templates/udp2log/filters.oxygen.erb @@ -11,7 +11,7 @@ # for each mobile service provider/country. # Disabling this until Erosen and Amit give us the ok, # Once they do, we will disable all other IP based filters. -# pipe 1 /usr/bin/awk '$NF != "-"' >> /a/squid/zero-x-cs.tab.log +# pipe 1 /usr/bin/awk '$NF != "-"' >> /a/log/webrequest/zero-x-cs.tsv.log @@ -
[MediaWiki-commits] [Gerrit] Getting oxygen ready for precise upgrade - change (operations/puppet)
Ottomata has submitted this change and it was merged. Change subject: Getting oxygen ready for precise upgrade .. Getting oxygen ready for precise upgrade Change-Id: Ifa81308e0f6e893e4c25483322ab15992f8e8a0a --- M manifests/misc/statistics.pp M manifests/role/logging.pp M manifests/site.pp M templates/udp2log/filters.oxygen.erb 4 files changed, 52 insertions(+), 64 deletions(-) Approvals: Ottomata: Verified; Looks good to me, approved jenkins-bot: Verified diff --git a/manifests/misc/statistics.pp b/manifests/misc/statistics.pp index 049b0e9..3ba61b0 100644 --- a/manifests/misc/statistics.pp +++ b/manifests/misc/statistics.pp @@ -625,7 +625,7 @@ # wikipedia zero logs from oxygen misc::statistics::rsync_job { "wikipedia_zero": - source => "oxygen.wikimedia.org::udp2log/archive/zero-*.gz", + source => "oxygen.wikimedia.org::udp2log/webrequest/archive/zero-*.gz", destination => "/a/squid/archive/zero", } diff --git a/manifests/role/logging.pp b/manifests/role/logging.pp index a057d2c..5192dfb 100644 --- a/manifests/role/logging.pp +++ b/manifests/role/logging.pp @@ -266,6 +266,31 @@ } } +# oxygen is a generic webrequests udp2log host +# mostly running wikipedia zero filters. +class role::logging::udp2log::oxygen inherits role::logging::udp2log { + # udp2log::instance will ensure this is created + $webrequest_log_directory= "$log_directory/webrequest" + + misc::udp2log::instance { 'oxygen': + multicast => true, + log_directory => $webrequest_log_directory, + } +} + +# lucene udp2log instance for capturing search logs +class role::logging::udp2log::lucene inherits role::logging::udp2log { + # udp2log::instance will ensure this is created + $lucene_log_directory= "$log_directory/lucene" + + misc::udp2log::instance { 'lucene': + port => '51234', + log_directory=> $lucene_log_directory, + monitor_packet_loss => false, + } +} + + # EventLogging collector class role::logging::eventlogging { system_role { "misc::log-collector": diff --git a/manifests/site.pp b/manifests/site.pp index 28b8823..4556699 100644 --- a/manifests/site.pp +++ b/manifests/site.pp @@ -2109,47 +2109,10 @@ # RT 4312 accounts::milimetric - include - misc::udp2log - - sudo_user { "otto": privileges => ['ALL = NOPASSWD: ALL'] } - - # oxygen's udp2log instance - # saves logs mainly in /a/squid. - misc::udp2log::instance { "oxygen": - multicast => true, - # TODO: Move this to /a/log/webrequest - log_directory => "/a/squid", - # oxygen's packet-loss.log file is alredy in /var/log/udp2log - packet_loss_log => "/var/log/udp2log/packet-loss.log", - } - - # Set up an rsync daemon module for udp2log logrotated - # archives. This allows stat1 to copy logs from the - # logrotated archive directory - class { "misc::udp2log::rsyncd": - path=> "/a/squid", - require => Misc::Udp2log::Instance["oxygen"], - } - - # udp2log-lucene instance for - # lucene search logs. Don't need - # to monitor packet loss here. 
- misc::udp2log::instance { "lucene": - port=> "51234", - log_directory => "/a/log/lucene", - monitor_packet_loss => false, - } - - # rsync archived lucene logs over to dataset2 - # These are available for download at http://dumps.wikimedia.org/other/search/ - cron { "search_logs_rsync": - command => "rsync -r /a/log/lucene/archive/lucene.log*.gz dataset2::search-logs/", - hour=> '8', - minute => '0', - user=> 'backup', - ensure => absent, - } + # main oxygen udp2log handles mostly Wikipedia Zero webrequest logs + include role::logging::udp2log::oxygen + # Also include lucene search loggging udp2log instance + include role::logging::udp2log::lucene } node /^payments[1-4]\.wikimedia\.org$/ { diff --git a/templates/udp2log/filters.oxygen.erb b/templates/udp2log/filters.oxygen.erb index a96046f..7050bdf 100644 --- a/templates/udp2log/filters.oxygen.erb +++ b/templates/udp2log/filters.oxygen.erb @@ -11,7 +11,7 @@ # for each mobile service provider/country. # Disabling this until Erosen and Amit give us the ok, # Once they do, we will disable all other IP based filters. -# pipe 1 /usr/bin/awk '$NF != "-"' >> /a/squid/zero-x-cs.tab.log +# pipe 1 /usr/bin/awk '$NF != "-"' >> /a/log/webrequest/zero-x-cs.tsv.log @@ -24,71 +24,71 @@ # # Digi Malays
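Both new role classes wrap misc::udp2log::instance with a purpose-specific subdirectory of $log_directory, which comes from the parent role::logging::udp2log class. Following the same pattern, another dedicated instance could be added like this (the 'edits' instance name and port are made up for illustration):

    class role::logging::udp2log::edits inherits role::logging::udp2log {
        # udp2log::instance will ensure this is created
        $edits_log_directory = "$log_directory/edits"

        misc::udp2log::instance { 'edits':
            port                => '8421',   # illustrative port only
            log_directory       => $edits_log_directory,
            monitor_packet_loss => false,
        }
    }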
[MediaWiki-commits] [Gerrit] Adding ganglia monitoring of webrequest data loss in Kraken ... - change (operations/puppet)
Ottomata has uploaded a new change for review. https://gerrit.wikimedia.org/r/65780 Change subject: Adding ganglia monitoring of webrequest data loss in Kraken HDFS .. Adding ganglia monitoring of webrequest data loss in Kraken HDFS Change-Id: Iefc4b809434e357b5cd8aec416434fd45cf18a4c --- A files/ganglia/plugins/kraken_webrequest_loss.py A files/ganglia/plugins/kraken_webrequest_loss.pyconf M manifests/misc/monitoring.pp M manifests/site.pp 4 files changed, 131 insertions(+), 0 deletions(-) git pull ssh://gerrit.wikimedia.org:29418/operations/puppet refs/changes/80/65780/1 diff --git a/files/ganglia/plugins/kraken_webrequest_loss.py b/files/ganglia/plugins/kraken_webrequest_loss.py new file mode 100644 index 000..7f3698b --- /dev/null +++ b/files/ganglia/plugins/kraken_webrequest_loss.py @@ -0,0 +1,89 @@ +#!/usr/bin/env python +# -*- coding: utf-8 -*- +""" +Python Gmond Module for Kraken Webrequest Loss Percentage. +Loss percentage per source host data is generated by the packetloss +Oozie job in Kraken. + +:copyright: (c) 2012 Wikimedia Foundation +:author: Andrew Otto +:license: GPL + +""" +from __future__ import print_function + +import logging +import commands + +UPDATE_INTERVAL = 3600 # seconds + +# Config for multiple metrics. +# Currently we only compute a single webrequest loss +# percentage, but this allows us to add more later. +metrics = { +'webrequest_loss_average': { +'description': 'Average Webrequest Loss Percentage', +'path':'/wmf/data/webrequest/loss', +} +} + +def latest_loss_path(metric_name): +"""Returns HDFS path to the most recently generated webrequest loss data.""" +logging.debug("latest_loss_path(%s)" % metrics[metric_name]['path']) +return commands.getoutput("/usr/bin/hadoop fs -ls %s | /usr/bin/tail -n 1 | /usr/bin/awk '{print $NF}'" % (metrics[metric_name]['path'])) + +def loss_data(loss_path): +"""Returns the output data inside the HDFS loss_path.""" +logging.debug("loss_data(%s)" % loss_path) +return commands.getoutput("/usr/bin/hadoop fs -cat %s/part*" % (loss_path)) + +def loss_average(loss_data): +"""Parses loss_data for loss percentages and averages them all.""" +logging.debug("loss_average(%s)" % loss_data) +percent_sum = 0.0 +loss_lines = loss_data.split("\n") +for line in loss_lines: +fields = line.split("\t") +percent = fields[-1] +percent_sum += float(percent) + +average_percent = (percent_sum / float(len(loss_lines))) +return average_percent + +def metric_handler(name): +"""Get value of particular metric; part of Gmond interface""" +logging.debug('metric_handler(): %s', name) +return loss_average(loss_data(latest_loss_path(name))) + +def metric_init(params): +global descriptors + +descriptors = [] +for metric_name, metric_config in metrics.items(): +descriptors.append({ +'name': metric_name, +'call_back': metric_handler, +'time_max': 3660, +'value_type': 'float', +'units': '%', +'slope': 'both', +'format': '%f', +'description': metric_config['description'], +'groups': 'analytics' +}) + +return descriptors + + +def metric_cleanup(): +"""Teardown; part of Gmond interface""" +pass + + +if __name__ == '__main__': +# When invoked as standalone script, run a self-test by querying each +# metric descriptor and printing it out. 
+logging.basicConfig(level=logging.DEBUG) +for metric in metric_init({}): +value = metric['call_back'](metric['name']) +print(( "%s => " + metric['format'] ) % ( metric['name'], value )) diff --git a/files/ganglia/plugins/kraken_webrequest_loss.pyconf b/files/ganglia/plugins/kraken_webrequest_loss.pyconf new file mode 100644 index 000..c4db97b --- /dev/null +++ b/files/ganglia/plugins/kraken_webrequest_loss.pyconf @@ -0,0 +1,20 @@ +# Gmond configuration for calculating +# webrequest data loss stored in HDFS in Kraken. + +modules { + module { +name = "kraken_webrequest_loss" +language = "python" + } +} + +collection_group { + collect_every = 3600 + time_threshold = 3660 + + metric { +name = "webrequest_loss_average" +title = "Average Loss Percentage" +value_threshold = 0 + } +} diff --git a/manifests/misc/monitoring.pp b/manifests/misc/monitoring.pp index 16e3953..5f6c6c1 100644 --- a/manifests/misc/monitoring.pp +++ b/manifests/misc/monitoring.pp @@ -29,6 +29,7 @@ # == Class misc::monitoring::net::udp # Sends UDP statistics to ganglia. +# class misc::monitoring::net::udp { file { '/usr/lib/ganglia/python_modules/udp_stats.py': @@ -42,6 +43,23 @@ } } +# == Class misc::monitoring::kraken::loss +# Checks recently generated webrequest loss statistics in +# Kraken HDFS and se
[MediaWiki-commits] [Gerrit] Adding ganglia monitoring of webrequest data loss in Kraken ... - change (operations/puppet)
Ottomata has submitted this change and it was merged. Change subject: Adding ganglia monitoring of webrequest data loss in Kraken HDFS .. Adding ganglia monitoring of webrequest data loss in Kraken HDFS Change-Id: Iefc4b809434e357b5cd8aec416434fd45cf18a4c --- A files/ganglia/plugins/kraken_webrequest_loss.py A files/ganglia/plugins/kraken_webrequest_loss.pyconf M manifests/misc/monitoring.pp M manifests/site.pp 4 files changed, 131 insertions(+), 0 deletions(-) Approvals: Ottomata: Verified; Looks good to me, approved diff --git a/files/ganglia/plugins/kraken_webrequest_loss.py b/files/ganglia/plugins/kraken_webrequest_loss.py new file mode 100644 index 000..8b966a8 --- /dev/null +++ b/files/ganglia/plugins/kraken_webrequest_loss.py @@ -0,0 +1,89 @@ +#!/usr/bin/env python +# -*- coding: utf-8 -*- +""" +Python Gmond Module for Kraken Webrequest Loss Percentage. +Loss percentage per source host data is generated by the packetloss +Oozie job in Kraken. + +:copyright: (c) 2012 Wikimedia Foundation +:author: Andrew Otto +:license: GPL + +""" +from __future__ import print_function + +import logging +import commands + +UPDATE_INTERVAL = 3600 # seconds + +# Config for multiple metrics. +# Currently we only compute a single webrequest loss +# percentage, but this allows us to add more later. +metrics = { +'webrequest_loss_average': { +'description': 'Average Webrequest Loss Percentage', +'path':'/wmf/data/webrequest/loss', +} +} + +def latest_loss_path(metric_name): +"""Returns HDFS path to the most recently generated webrequest loss data.""" +logging.debug("latest_loss_path(%s)" % metrics[metric_name]['path']) +return commands.getoutput("/usr/bin/hadoop fs -ls %s | /usr/bin/tail -n 1 | /usr/bin/awk '{print $NF}'" % (metrics[metric_name]['path'])) + +def loss_data(loss_path): +"""Returns the output data inside the HDFS loss_path.""" +logging.debug("loss_data(%s)" % loss_path) +return commands.getoutput("/usr/bin/hadoop fs -cat %s/part*" % (loss_path)) + +def loss_average(loss_data): +"""Parses loss_data for loss percentages and averages them all.""" +logging.debug("loss_average(%s)" % loss_data) +percent_sum = 0.0 +loss_lines = loss_data.split("\n") +for line in loss_lines: +fields = line.split("\t") +percent = fields[-1] +percent_sum += float(percent) + +average_percent = (percent_sum / float(len(loss_lines))) +return average_percent + +def metric_handler(name): +"""Get value of particular metric; part of Gmond interface""" +logging.debug('metric_handler(): %s', name) +return loss_average(loss_data(latest_loss_path(name))) + +def metric_init(params): +global descriptors + +descriptors = [] +for metric_name, metric_config in metrics.items(): +descriptors.append({ +'name': metric_name, +'call_back': metric_handler, +'time_max': 3660, +'value_type': 'float', +'units': '%', +'slope': 'both', +'format': '%f', +'description': metric_config['description'], +'groups': 'analytics' +}) + +return descriptors + + +def metric_cleanup(): +"""Teardown; part of Gmond interface""" +pass + + +if __name__ == '__main__': +# When invoked as standalone script, run a self-test by querying each +# metric descriptor and printing it out. 
+logging.basicConfig(level=logging.DEBUG) +for metric in metric_init({}): +value = metric['call_back'](metric['name']) +print(( "%s => " + metric['format'] ) % ( metric['name'], value )) diff --git a/files/ganglia/plugins/kraken_webrequest_loss.pyconf b/files/ganglia/plugins/kraken_webrequest_loss.pyconf new file mode 100644 index 000..2ea2fea --- /dev/null +++ b/files/ganglia/plugins/kraken_webrequest_loss.pyconf @@ -0,0 +1,20 @@ +# Gmond configuration for calculating +# webrequest data loss stored in HDFS in Kraken. + +modules { + module { +name = "kraken_webrequest_loss" +language = "python" + } +} + +collection_group { + collect_every = 3600 + time_threshold = 3660 + + metric { +name = "webrequest_loss_average" +title = "Average Loss Percentage" +value_threshold = 0 + } +} diff --git a/manifests/misc/monitoring.pp b/manifests/misc/monitoring.pp index 16e3953..5f6c6c1 100644 --- a/manifests/misc/monitoring.pp +++ b/manifests/misc/monitoring.pp @@ -29,6 +29,7 @@ # == Class misc::monitoring::net::udp # Sends UDP statistics to ganglia. +# class misc::monitoring::net::udp { file { '/usr/lib/ganglia/python_modules/udp_stats.py': @@ -42,6 +43,23 @@ } } +# == Class misc::monitoring::kraken::loss +# Checks recently generated webrequest loss statistics in +# Kraken HDFS and sends the average loss percentage to ganglia. +# +class misc::monit
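For readers unfamiliar with gmond Python modules: the plugin consists of the .py module (dropped into ganglia's python_modules directory) plus a .pyconf collection-group file, and gmond must be restarted to load them. The file resources are cut off in the diff above; a sketch of the deployment pattern, where the .pyconf destination (/etc/ganglia/conf.d) is an assumption since it is not visible above:

    # Sketch of the gmond plugin deployment (destination of the .pyconf file
    # is assumed; the .py path matches the pattern used for udp_stats.py).
    file { '/usr/lib/ganglia/python_modules/kraken_webrequest_loss.py':
        source => 'puppet:///files/ganglia/plugins/kraken_webrequest_loss.py',
        notify => Service['gmond'],
    }
    file { '/etc/ganglia/conf.d/kraken_webrequest_loss.pyconf':
        source => 'puppet:///files/ganglia/plugins/kraken_webrequest_loss.pyconf',
        notify => Service['gmond'],
    }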
[MediaWiki-commits] [Gerrit] Fixing eventlogging stat1 rsync job puppetization - change (operations/puppet)
Ottomata has uploaded a new change for review. https://gerrit.wikimedia.org/r/65821 Change subject: Fixing eventlogging stat1 rsync job puppetization .. Fixing eventlogging stat1 rsync job puppetization Change-Id: Ib9ff61c0f0e21b008a7019cfb5c33fb3699c2439 --- M manifests/misc/statistics.pp 1 file changed, 0 insertions(+), 1 deletion(-) git pull ssh://gerrit.wikimedia.org:29418/operations/puppet refs/changes/21/65821/1 diff --git a/manifests/misc/statistics.pp b/manifests/misc/statistics.pp index 3ba61b0..206793c 100644 --- a/manifests/misc/statistics.pp +++ b/manifests/misc/statistics.pp @@ -614,7 +614,6 @@ "/a/aft", "/a/aft/archive", "/a/eventlogging", - "/a/eventlogging/archive", "/a/public-datasets", ]: ensure => directory, -- To view, visit https://gerrit.wikimedia.org/r/65821 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: newchange Gerrit-Change-Id: Ib9ff61c0f0e21b008a7019cfb5c33fb3699c2439 Gerrit-PatchSet: 1 Gerrit-Project: operations/puppet Gerrit-Branch: production Gerrit-Owner: Ottomata ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
[MediaWiki-commits] [Gerrit] Fixing eventlogging stat1 rsync job puppetization - change (operations/puppet)
Ottomata has submitted this change and it was merged. Change subject: Fixing eventlogging stat1 rsync job puppetization .. Fixing eventlogging stat1 rsync job puppetization Change-Id: Ib9ff61c0f0e21b008a7019cfb5c33fb3699c2439 --- M manifests/misc/statistics.pp 1 file changed, 0 insertions(+), 1 deletion(-) Approvals: Ottomata: Verified; Looks good to me, approved diff --git a/manifests/misc/statistics.pp b/manifests/misc/statistics.pp index 3ba61b0..206793c 100644 --- a/manifests/misc/statistics.pp +++ b/manifests/misc/statistics.pp @@ -614,7 +614,6 @@ "/a/aft", "/a/aft/archive", "/a/eventlogging", - "/a/eventlogging/archive", "/a/public-datasets", ]: ensure => directory, -- To view, visit https://gerrit.wikimedia.org/r/65821 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: merged Gerrit-Change-Id: Ib9ff61c0f0e21b008a7019cfb5c33fb3699c2439 Gerrit-PatchSet: 1 Gerrit-Project: operations/puppet Gerrit-Branch: production Gerrit-Owner: Ottomata Gerrit-Reviewer: Ottomata ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
[MediaWiki-commits] [Gerrit] Adding alerts for webrequest data loss in HDFS - change (operations/puppet)
Ottomata has uploaded a new change for review. https://gerrit.wikimedia.org/r/66241 Change subject: Adding alerts for webrequest data loss in HDFS .. Adding alerts for webrequest data loss in HDFS Change-Id: If91ce8badded15a2d15e8a0be42735ebe80f5968 --- M manifests/misc/analytics.pp M manifests/misc/monitoring.pp M templates/icinga/checkcommands.cfg.erb 3 files changed, 28 insertions(+), 1 deletion(-) git pull ssh://gerrit.wikimedia.org:29418/operations/puppet refs/changes/41/66241/1 diff --git a/manifests/misc/analytics.pp b/manifests/misc/analytics.pp index 66655da..bea13d8 100644 --- a/manifests/misc/analytics.pp +++ b/manifests/misc/analytics.pp @@ -35,4 +35,4 @@ check_command => "check_kafka_broker_produce_requests!3!2", contact_group => "analytics", } -} \ No newline at end of file +} diff --git a/manifests/misc/monitoring.pp b/manifests/misc/monitoring.pp index 5f6c6c1..a2ec64c 100644 --- a/manifests/misc/monitoring.pp +++ b/manifests/misc/monitoring.pp @@ -58,6 +58,20 @@ source => "puppet:///files/ganglia/plugins/kraken_webrequest_loss.pyconf", notify => Service[gmond]; } + + # Set up icinga monitoring of Kraken HDFS data loss. + monitor_service { "kraken_webrequest_loss_average_positive": + description => "webrequest_loss_average_positive", + check_command => "check_kraken_webrequest_loss_positive!2!8", + contact_group => "analytics", + } + # It is possible to have negative data loss. This would mean that + # we are receiving duplicates log lines. We need alerts for this too. + monitor_service { "kraken_webrequest_loss_average_negative": + description => "webrequest_loss_average_negative", + check_command => "check_kraken_webrequest_loss_negative!-2!-8", + contact_group => "analytics", + } } # Ganglia views that should be diff --git a/templates/icinga/checkcommands.cfg.erb b/templates/icinga/checkcommands.cfg.erb index 830ba4a..b156f7e 100644 --- a/templates/icinga/checkcommands.cfg.erb +++ b/templates/icinga/checkcommands.cfg.erb @@ -621,4 +621,17 @@ command_line$USER1$/check_ganglios_generic_value -H $HOSTADDRESS$ -m kafka_network_SocketServerStats.ProduceRequestsPerSecond -w $ARG1$ -c $ARG2$ -o lt } +# Alerts for data loss in Kraken HDFS. +define command{ + command_namecheck_kraken_webrequest_loss_positive + command_line$USER1$/check_ganglios_generic_value -H $HOSTADDRESS$ -m webrequest_loss_average -w $ARG1$ -c $ARG2$ -o gt +} + +# Data loss percentage CAN be negative if we receive duplicate traffic +# (this has happened before). We need an extra alert if the percentages goes negative. +define command{ + command_namecheck_kraken_webrequest_loss_negative + command_line$USER1$/check_ganglios_generic_value -H $HOSTADDRESS$ -m webrequest_loss_average -w $ARG1$ -c $ARG2$ -o lt +} + -- To view, visit https://gerrit.wikimedia.org/r/66241 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: newchange Gerrit-Change-Id: If91ce8badded15a2d15e8a0be42735ebe80f5968 Gerrit-PatchSet: 1 Gerrit-Project: operations/puppet Gerrit-Branch: production Gerrit-Owner: Ottomata ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
[MediaWiki-commits] [Gerrit] Adding alerts for webrequest data loss in HDFS - change (operations/puppet)
Ottomata has submitted this change and it was merged. Change subject: Adding alerts for webrequest data loss in HDFS .. Adding alerts for webrequest data loss in HDFS Change-Id: If91ce8badded15a2d15e8a0be42735ebe80f5968 --- M manifests/misc/analytics.pp M manifests/misc/monitoring.pp M templates/icinga/checkcommands.cfg.erb 3 files changed, 28 insertions(+), 1 deletion(-) Approvals: Ottomata: Verified; Looks good to me, approved diff --git a/manifests/misc/analytics.pp b/manifests/misc/analytics.pp index 66655da..bea13d8 100644 --- a/manifests/misc/analytics.pp +++ b/manifests/misc/analytics.pp @@ -35,4 +35,4 @@ check_command => "check_kafka_broker_produce_requests!3!2", contact_group => "analytics", } -} \ No newline at end of file +} diff --git a/manifests/misc/monitoring.pp b/manifests/misc/monitoring.pp index 5f6c6c1..a2ec64c 100644 --- a/manifests/misc/monitoring.pp +++ b/manifests/misc/monitoring.pp @@ -58,6 +58,20 @@ source => "puppet:///files/ganglia/plugins/kraken_webrequest_loss.pyconf", notify => Service[gmond]; } + + # Set up icinga monitoring of Kraken HDFS data loss. + monitor_service { "kraken_webrequest_loss_average_positive": + description => "webrequest_loss_average_positive", + check_command => "check_kraken_webrequest_loss_positive!2!8", + contact_group => "analytics", + } + # It is possible to have negative data loss. This would mean that + # we are receiving duplicates log lines. We need alerts for this too. + monitor_service { "kraken_webrequest_loss_average_negative": + description => "webrequest_loss_average_negative", + check_command => "check_kraken_webrequest_loss_negative!-2!-8", + contact_group => "analytics", + } } # Ganglia views that should be diff --git a/templates/icinga/checkcommands.cfg.erb b/templates/icinga/checkcommands.cfg.erb index 830ba4a..b156f7e 100644 --- a/templates/icinga/checkcommands.cfg.erb +++ b/templates/icinga/checkcommands.cfg.erb @@ -621,4 +621,17 @@ command_line$USER1$/check_ganglios_generic_value -H $HOSTADDRESS$ -m kafka_network_SocketServerStats.ProduceRequestsPerSecond -w $ARG1$ -c $ARG2$ -o lt } +# Alerts for data loss in Kraken HDFS. +define command{ + command_namecheck_kraken_webrequest_loss_positive + command_line$USER1$/check_ganglios_generic_value -H $HOSTADDRESS$ -m webrequest_loss_average -w $ARG1$ -c $ARG2$ -o gt +} + +# Data loss percentage CAN be negative if we receive duplicate traffic +# (this has happened before). We need an extra alert if the percentages goes negative. +define command{ + command_namecheck_kraken_webrequest_loss_negative + command_line$USER1$/check_ganglios_generic_value -H $HOSTADDRESS$ -m webrequest_loss_average -w $ARG1$ -c $ARG2$ -o lt +} + -- To view, visit https://gerrit.wikimedia.org/r/66241 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: merged Gerrit-Change-Id: If91ce8badded15a2d15e8a0be42735ebe80f5968 Gerrit-PatchSet: 1 Gerrit-Project: operations/puppet Gerrit-Branch: production Gerrit-Owner: Ottomata Gerrit-Reviewer: Ottomata ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
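As the command definitions above suggest, check_ganglios_generic_value reads the gmond metric named by -m and compares it against the -w/-c thresholds using the -o operator, so check_kraken_webrequest_loss_positive!2!8 warns when average loss exceeds 2% and goes critical above 8%, while the negative variant (-o lt with -2/-8) fires when duplicated traffic drives the value below -2% or -8%. Further checks on the same metric could reuse this plumbing; a hypothetical example (service name and thresholds invented for illustration):

    # Hypothetical: a stricter warning threshold on the same average metric.
    monitor_service { "kraken_webrequest_loss_average_strict":
        description   => "webrequest_loss_average_strict",
        check_command => "check_kraken_webrequest_loss_positive!1!4",
        contact_group => "analytics",
    }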
[MediaWiki-commits] [Gerrit] Adding classes to install hive, pig and sqoop. (very simple!) - change (operations...cdh4)
Ottomata has submitted this change and it was merged. Change subject: Adding classes to install hive, pig and sqoop. (very simple!) .. Adding classes to install hive, pig and sqoop. (very simple!) Hive Server and Hive Metastore puppetiztaion will come in a separate commit. Adding and fixing some tests. Change-Id: Iaa7769ce39e7002edb2b2476df3f850c611b4e6b --- M TODO.md A manifests/hive.pp A manifests/pig.pp A manifests/sqoop.pp A templates/pig/pig.properties.erb M tests/Makefile M tests/historyserver.pp A tests/hive.pp M tests/jobtracker.pp A tests/pig.pp M tests/resourcemanager.pp A tests/sqoop.pp 12 files changed, 119 insertions(+), 3 deletions(-) Approvals: Ori.livneh: Looks good to me, but someone else must approve Akosiaris: Looks good to me, approved jenkins-bot: Verified diff --git a/TODO.md b/TODO.md index 2fb0842..9dac694 100644 --- a/TODO.md +++ b/TODO.md @@ -11,4 +11,16 @@ - Make log4j.properties more configurable. - Support Secondary NameNode. - Support High Availability NameNode. -- Make JMX ports configurable. \ No newline at end of file +- Make JMX ports configurable. + +## Hive +- Hive Server + Hive Metastore + +## Oozie + +## Hue + +## HBase + +## Zookeeper + diff --git a/manifests/hive.pp b/manifests/hive.pp new file mode 100644 index 000..5bac968 --- /dev/null +++ b/manifests/hive.pp @@ -0,0 +1,10 @@ +# == Class cdh4::hive +# +# Installs Hive packages (needed for Hive Client). +# Use cdh4::hive::server to install and set up a Hive server. +# +class cdh4::hive { + package { 'hive': +ensure => 'installed', + } +} \ No newline at end of file diff --git a/manifests/pig.pp b/manifests/pig.pp new file mode 100644 index 000..a78243d --- /dev/null +++ b/manifests/pig.pp @@ -0,0 +1,14 @@ +# == Class cdh4::pig +# +# Installs and configures Apache Pig. +# +class cdh4::pig { + package { 'pig': +ensure => 'installed', + } + + file { '/etc/pig/conf/pig.properties': +content => template('cdh4/pig/pig.properties.erb'), +require => Package['pig'], + } +} diff --git a/manifests/sqoop.pp b/manifests/sqoop.pp new file mode 100644 index 000..bb93f19 --- /dev/null +++ b/manifests/sqoop.pp @@ -0,0 +1,16 @@ +# == Class cdh4::sqoop +# Installs Sqoop +class cdh4::sqoop { + package { ['sqoop', 'libmysql-java']: +ensure => 'installed', + } + + # symlink the mysql-connector-java.jar that is installed by + # libmysql-java into /usr/lib/sqoop/lib + + file { '/usr/lib/sqoop/lib/mysql-connector-java.jar': +ensure => 'link', +target => '/usr/share/java/mysql-connector-java.jar', +require => [Package['sqoop'], Package['libmysql-java']], + } +} \ No newline at end of file diff --git a/templates/pig/pig.properties.erb b/templates/pig/pig.properties.erb new file mode 100644 index 000..ea1ff44 --- /dev/null +++ b/templates/pig/pig.properties.erb @@ -0,0 +1,54 @@ +# Pig configuration file. All values can be overwritten by command line arguments. +# see bin/pig -help + +# log4jconf log4j configuration file +# log4jconf=./conf/log4j.properties + +# brief logging (no timestamps) +brief=false + +# clustername, name of the hadoop jobtracker. If no port is defined port 50020 will be used. 
+#cluster + +#debug level, INFO is default +debug=INFO + +# a file that contains pig script +#file= + +# load jarfile, colon separated +#jar= + +#verbose print all log messages to screen (default to print only INFO and above to screen) +verbose=false + +#exectype local|mapreduce, mapreduce is default +#exectype=mapreduce +# hod realted properties +#ssh.gateway +#hod.expect.root +#hod.expect.useensure => installed +#hod.command +#hod.config.dir +#hod.param + + +#Do not spill temp files smaller than this size (bytes) +pig.spill.size.threshold=500 +#EXPERIMENT: Activate garbage collection when spilling a file bigger than this size (bytes) +#This should help reduce the number of files being spilled. +pig.spill.gc.activation.size=4000 + + +## +# Everything below this line is Yahoo specific. Note that I've made +# (almost) no changes to the lines above to make merging in from Apache +# easier. Any values I don't want from above I override below. +# +# This file is configured for use with HOD on the production clusters. If you +# want to run pig with a static cluster you will need to remove everything +# below this line and set the cluster value (above) to the +# hostname and port of your job tracker. + +exectype=mapreduce +log.file= diff --git a/tests/Makefile b/tests/Makefile index d3fa013..b1acb3b 100644 --- a/tests/Makefile +++ b/tests/Makefile @@ -1,4 +1,4 @@ -MANIFESTS=datanode.po defaults.po hadoop.po historyserver.po jobtracker.po Makefile master.po namenode.po nodemanager.po resourcemanager.po tasktracker.po worker.po +MANIFESTS=datanode.po defaults.po hadoop.po historyserver.po hive.po jobtracker.po Makefile master.po namenode.po nodemanager.po pig.po resour
[MediaWiki-commits] [Gerrit] Updating modules/cdh4 to latest ecosystem commit. - change (operations/puppet)
Ottomata has uploaded a new change for review. https://gerrit.wikimedia.org/r/66263 Change subject: Updating modules/cdh4 to latest ecosystem commit. .. Updating modules/cdh4 to latest ecosystem commit. This will allow us to puppetize and reinstall the Kraken hadoop nodes. Change-Id: I6e0701c2de5f30d27f85c77e9c352e6bb5be6d49 --- M modules/cdh4 1 file changed, 0 insertions(+), 0 deletions(-) git pull ssh://gerrit.wikimedia.org:29418/operations/puppet refs/changes/63/66263/1 diff --git a/modules/cdh4 b/modules/cdh4 index 79452a3..a9ce1b1 16 --- a/modules/cdh4 +++ b/modules/cdh4 -Subproject commit 79452a332dc23b8400686f96bb45a18c18241796 +Subproject commit a9ce1b1f5ac29392c6679dd7952dd0fd8db37bd3 -- To view, visit https://gerrit.wikimedia.org/r/66263 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I6e0701c2de5f30d27f85c77e9c352e6bb5be6d49 Gerrit-PatchSet: 1 Gerrit-Project: operations/puppet Gerrit-Branch: production Gerrit-Owner: Ottomata ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
[MediaWiki-commits] [Gerrit] Updating modules/cdh4 to latest ecosystem commit. - change (operations/puppet)
Ottomata has submitted this change and it was merged. Change subject: Updating modules/cdh4 to latest ecosystem commit. .. Updating modules/cdh4 to latest ecosystem commit. This will allow us to puppetize and reinstall the Kraken hadoop nodes. Change-Id: I6e0701c2de5f30d27f85c77e9c352e6bb5be6d49 --- M modules/cdh4 1 file changed, 0 insertions(+), 0 deletions(-) Approvals: Ottomata: Verified; Looks good to me, approved diff --git a/modules/cdh4 b/modules/cdh4 index 79452a3..a9ce1b1 16 --- a/modules/cdh4 +++ b/modules/cdh4 -Subproject commit 79452a332dc23b8400686f96bb45a18c18241796 +Subproject commit a9ce1b1f5ac29392c6679dd7952dd0fd8db37bd3 -- To view, visit https://gerrit.wikimedia.org/r/66263 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: merged Gerrit-Change-Id: I6e0701c2de5f30d27f85c77e9c352e6bb5be6d49 Gerrit-PatchSet: 1 Gerrit-Project: operations/puppet Gerrit-Branch: production Gerrit-Owner: Ottomata Gerrit-Reviewer: Ottomata ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
[MediaWiki-commits] [Gerrit] Updating some hadoop mapreduce and yarn configs - change (operations...cdh4)
Ottomata has uploaded a new change for review. https://gerrit.wikimedia.org/r/66337 Change subject: Updating some hadoop mapreduce and yarn configs .. Updating some hadoop mapreduce and yarn configs Change-Id: I5239e8fea4515c2209e3c959e23b595c0be9ca50 --- M manifests/hadoop.pp M manifests/hadoop/defaults.pp M templates/hadoop/log4j.properties.erb M templates/hadoop/mapred-site.xml.erb M templates/hadoop/yarn-site.xml.erb 5 files changed, 118 insertions(+), 56 deletions(-) git pull ssh://gerrit.wikimedia.org:29418/operations/puppet/cdh4 refs/changes/37/66337/1 diff --git a/manifests/hadoop.pp b/manifests/hadoop.pp index fa24400..57f26ed 100644 --- a/manifests/hadoop.pp +++ b/manifests/hadoop.pp @@ -26,33 +26,46 @@ # $io_file_buffer_size # $map_tasks_maximum # $reduce_tasks_maximum +# $mapreduce_job_reuse_jvm_num_tasks # $reduce_parallel_copies # $map_memory_mb -# $mapreduce_job_reuse_jvm_num_tasks +# $reduce_memory_mb +# $mapreduce_task_io_sort_mb +# $mapreduce_task_io_sort_factor +# $mapreduce_map_java_opts # $mapreduce_child_java_opts +# $yarn_nodemanager_resource_memory_mb +# $yarn_resourcemanager_scheduler_class - If you change this (e.g. to FairScheduler), you should also provide your own scheduler config .xml files outside of the cdh4 module. # $use_yarn # class cdh4::hadoop( $namenode_hostname, $dfs_name_dir, - $config_directory = $::cdh4::hadoop::defaults::config_directory, - $datanode_mounts = $::cdh4::hadoop::defaults::datanode_mounts, - $dfs_data_path = $::cdh4::hadoop::defaults::dfs_data_path, - $yarn_local_path = $::cdh4::hadoop::defaults::yarn_local_path, - $yarn_logs_path= $::cdh4::hadoop::defaults::yarn_logs_path, - $dfs_block_size= $::cdh4::hadoop::defaults::dfs_block_size, - $enable_jmxremote = $::cdh4::hadoop::defaults::enable_jmxremote, - $enable_webhdfs= $::cdh4::hadoop::defaults::enable_webhdfs, - $enable_intermediate_compression = $::cdh4::hadoop::defaults::enable_intermediate_compression, - $enable_final_compession = $::cdh4::hadoop::defaults::enable_final_compession, - $io_file_buffer_size = $::cdh4::hadoop::defaults::io_file_buffer_size, - $map_tasks_maximum = $::cdh4::hadoop::defaults::map_tasks_maximum, - $reduce_tasks_maximum = $::cdh4::hadoop::defaults::reduce_tasks_maximum, - $reduce_parallel_copies= $::cdh4::hadoop::defaults::reduce_parallel_copies, - $map_memory_mb = $::cdh4::hadoop::defaults::map_memory_mb, - $mapreduce_job_reuse_jvm_num_tasks = $::cdh4::hadoop::defaults::mapreduce_job_reuse_jvm_num_tasks, - $mapreduce_child_java_opts = $::cdh4::hadoop::defaults::mapreduce_child_java_opts, - $use_yarn = $::cdh4::hadoop::defaults::use_yarn + $config_directory= $::cdh4::hadoop::defaults::config_directory, + $datanode_mounts = $::cdh4::hadoop::defaults::datanode_mounts, + $dfs_data_path = $::cdh4::hadoop::defaults::dfs_data_path, + $yarn_local_path = $::cdh4::hadoop::defaults::yarn_local_path, + $yarn_logs_path = $::cdh4::hadoop::defaults::yarn_logs_path, + $dfs_block_size = $::cdh4::hadoop::defaults::dfs_block_size, + $enable_jmxremote= $::cdh4::hadoop::defaults::enable_jmxremote, + $enable_webhdfs = $::cdh4::hadoop::defaults::enable_webhdfs, + $enable_intermediate_compression = $::cdh4::hadoop::defaults::enable_intermediate_compression, + $enable_final_compession = $::cdh4::hadoop::defaults::enable_final_compession, + $io_file_buffer_size = $::cdh4::hadoop::defaults::io_file_buffer_size, + $mapreduce_map_tasks_maximum = $::cdh4::hadoop::defaults::mapreduce_map_tasks_maximum, + $mapreduce_reduce_tasks_maximum = 
$::cdh4::hadoop::defaults::mapreduce_reduce_tasks_maximum, + $mapreduce_job_reuse_jvm_num_tasks = $::cdh4::hadoop::defaults::mapreduce_job_reuse_jvm_num_tasks, + $mapreduce_reduce_shuffle_parallelcopies = $::cdh4::hadoop::defaults::mapreduce_reduce_shuffle_parallelcopies, + $mapreduce_map_memory_mb = $::cdh4::hadoop::defaults::mapreduce_map_memory_mb, + $mapreduce_reduce_memory_mb = $::cdh4::hadoop::defaults::mapreduce_reduce_memory_mb, + $mapreduce_task_io_sort_mb = $::cdh4::hadoop::defaults::mapreduce_task_io_sort_mb, + $mapreduce_task_io_sort_factor = $::cdh4::hadoop::defaults::mapreduce_task_io_sort_factor, + $mapreduce_map_java_opts = $::cdh4::hadoop::defaults::mapreduc
[MediaWiki-commits] [Gerrit] Updating some hadoop mapreduce and yarn configs - change (operations...cdh4)
Ottomata has submitted this change and it was merged. Change subject: Updating some hadoop mapreduce and yarn configs .. Updating some hadoop mapreduce and yarn configs Change-Id: I5239e8fea4515c2209e3c959e23b595c0be9ca50 --- M manifests/hadoop.pp M manifests/hadoop/defaults.pp M templates/hadoop/log4j.properties.erb M templates/hadoop/mapred-site.xml.erb M templates/hadoop/yarn-site.xml.erb 5 files changed, 118 insertions(+), 56 deletions(-) Approvals: Ottomata: Verified; Looks good to me, approved jenkins-bot: Verified diff --git a/manifests/hadoop.pp b/manifests/hadoop.pp index fa24400..57f26ed 100644 --- a/manifests/hadoop.pp +++ b/manifests/hadoop.pp @@ -26,33 +26,46 @@ # $io_file_buffer_size # $map_tasks_maximum # $reduce_tasks_maximum +# $mapreduce_job_reuse_jvm_num_tasks # $reduce_parallel_copies # $map_memory_mb -# $mapreduce_job_reuse_jvm_num_tasks +# $reduce_memory_mb +# $mapreduce_task_io_sort_mb +# $mapreduce_task_io_sort_factor +# $mapreduce_map_java_opts # $mapreduce_child_java_opts +# $yarn_nodemanager_resource_memory_mb +# $yarn_resourcemanager_scheduler_class - If you change this (e.g. to FairScheduler), you should also provide your own scheduler config .xml files outside of the cdh4 module. # $use_yarn # class cdh4::hadoop( $namenode_hostname, $dfs_name_dir, - $config_directory = $::cdh4::hadoop::defaults::config_directory, - $datanode_mounts = $::cdh4::hadoop::defaults::datanode_mounts, - $dfs_data_path = $::cdh4::hadoop::defaults::dfs_data_path, - $yarn_local_path = $::cdh4::hadoop::defaults::yarn_local_path, - $yarn_logs_path= $::cdh4::hadoop::defaults::yarn_logs_path, - $dfs_block_size= $::cdh4::hadoop::defaults::dfs_block_size, - $enable_jmxremote = $::cdh4::hadoop::defaults::enable_jmxremote, - $enable_webhdfs= $::cdh4::hadoop::defaults::enable_webhdfs, - $enable_intermediate_compression = $::cdh4::hadoop::defaults::enable_intermediate_compression, - $enable_final_compession = $::cdh4::hadoop::defaults::enable_final_compession, - $io_file_buffer_size = $::cdh4::hadoop::defaults::io_file_buffer_size, - $map_tasks_maximum = $::cdh4::hadoop::defaults::map_tasks_maximum, - $reduce_tasks_maximum = $::cdh4::hadoop::defaults::reduce_tasks_maximum, - $reduce_parallel_copies= $::cdh4::hadoop::defaults::reduce_parallel_copies, - $map_memory_mb = $::cdh4::hadoop::defaults::map_memory_mb, - $mapreduce_job_reuse_jvm_num_tasks = $::cdh4::hadoop::defaults::mapreduce_job_reuse_jvm_num_tasks, - $mapreduce_child_java_opts = $::cdh4::hadoop::defaults::mapreduce_child_java_opts, - $use_yarn = $::cdh4::hadoop::defaults::use_yarn + $config_directory= $::cdh4::hadoop::defaults::config_directory, + $datanode_mounts = $::cdh4::hadoop::defaults::datanode_mounts, + $dfs_data_path = $::cdh4::hadoop::defaults::dfs_data_path, + $yarn_local_path = $::cdh4::hadoop::defaults::yarn_local_path, + $yarn_logs_path = $::cdh4::hadoop::defaults::yarn_logs_path, + $dfs_block_size = $::cdh4::hadoop::defaults::dfs_block_size, + $enable_jmxremote= $::cdh4::hadoop::defaults::enable_jmxremote, + $enable_webhdfs = $::cdh4::hadoop::defaults::enable_webhdfs, + $enable_intermediate_compression = $::cdh4::hadoop::defaults::enable_intermediate_compression, + $enable_final_compession = $::cdh4::hadoop::defaults::enable_final_compession, + $io_file_buffer_size = $::cdh4::hadoop::defaults::io_file_buffer_size, + $mapreduce_map_tasks_maximum = $::cdh4::hadoop::defaults::mapreduce_map_tasks_maximum, + $mapreduce_reduce_tasks_maximum = $::cdh4::hadoop::defaults::mapreduce_reduce_tasks_maximum, + 
$mapreduce_job_reuse_jvm_num_tasks = $::cdh4::hadoop::defaults::mapreduce_job_reuse_jvm_num_tasks, + $mapreduce_reduce_shuffle_parallelcopies = $::cdh4::hadoop::defaults::mapreduce_reduce_shuffle_parallelcopies, + $mapreduce_map_memory_mb = $::cdh4::hadoop::defaults::mapreduce_map_memory_mb, + $mapreduce_reduce_memory_mb = $::cdh4::hadoop::defaults::mapreduce_reduce_memory_mb, + $mapreduce_task_io_sort_mb = $::cdh4::hadoop::defaults::mapreduce_task_io_sort_mb, + $mapreduce_task_io_sort_factor = $::cdh4::hadoop::defaults::mapreduce_task_io_sort_factor, + $mapreduce_map_java_opts = $::cdh4::hadoop::defaults::mapreduce_map_java_opts, + $mapreduce_reduce_ja
[MediaWiki-commits] [Gerrit] Puppetizing analytics1020 with roles/hadoop.pp! - change (operations/puppet)
Ottomata has uploaded a new change for review. https://gerrit.wikimedia.org/r/66553 Change subject: Puppetizing analytics1020 with roles/hadoop.pp! .. Puppetizing analytics1020 with roles/hadoop.pp! Change-Id: I17f5af2acf0b0fd1313450d84e264222cf0d5296 --- A manifests/role/hadoop.pp M manifests/site.pp 2 files changed, 160 insertions(+), 2 deletions(-) git pull ssh://gerrit.wikimedia.org:29418/operations/puppet refs/changes/53/66553/1 diff --git a/manifests/role/hadoop.pp b/manifests/role/hadoop.pp new file mode 100644 index 000..62e34d2 --- /dev/null +++ b/manifests/role/hadoop.pp @@ -0,0 +1,150 @@ +# roles/hadoop.pp +# +# Role classes for Hadoop nodes. +# These role classes will configure hadoop properly in either +# the Analytics labs or Analytics production environments. + +# +# Usage: +# +# To install only hadoop client packages and configs: +# include role::hadoop +# +# To install a Hadoop Master (NameNode + ResourceManager, etc.): +# include role::hadoop::master +# +# To install a Hadoop Worker (DataNode + NodeManager + etc.): +# include role::hadoop::worker +# + + + +# == Class role::hadoop +# Installs base configs for Hadoop nodes +# +class role::hadoop { + # include common labs or production hadoop configs + # based on $::realm + if ($::realm == 'labs') { +include role::hadoop::labs + } + else { +include role::hadoop::production + } +} + +# == Class role::hadoop::master +# Includes cdh4::hadoop::master classes +# +class role::hadoop::master inherits role::hadoop { + system_role { "role::hadoop::master": description => "Hadoop Master (NameNode & ResourceManager)" } + include cdh4::hadoop::master +} + +# == Class role::hadoop::worker +# Includes cdh4::hadoop::worker classes +class role::hadoop::worker inherits role::hadoop { + system_role { "role::hadoop::worker": description => "Hadoop Worker (DataNode & NodeManager)" } + include cdh4::hadoop::worker +} + + +# == Class role::hadoop::production +# Common hadoop configs for the production Kraken cluster +# +class role::hadoop::production { + $namenode_hostname= "analytics1010.eqiad.wmnet" + $hadoop_name_directory= "/var/lib/hadoop/name" + + $hadoop_data_directory= "/var/lib/hadoop/data" + $datanode_mounts = [ +"$hadoop_data_directory/c", +"$hadoop_data_directory/d", +"$hadoop_data_directory/e", +"$hadoop_data_directory/f", +"$hadoop_data_directory/g", +"$hadoop_data_directory/h", +"$hadoop_data_directory/i", +"$hadoop_data_directory/j", +"$hadoop_data_directory/k", +"$hadoop_data_directory/l" + ] + + class { 'cdh4::hadoop': +namenode_hostname => $namenode_hostname, +datanode_mounts => $datanode_mounts, +dfs_name_dir=> [$hadoop_name_directory], +dfs_block_size => 268435456, # 256 MB +io_file_buffer_size => 131072, +mapreduce_map_tasks_maximum => ($processorcount - 2) / 2, +mapreduce_reduce_tasks_maximum => ($processorcount - 2) / 2, +mapreduce_job_reuse_jvm_num_tasks => 1, +mapreduce_map_memory_mb => 1536, +mapreduce_reduce_memory_mb => 3072, +mapreduce_map_java_opts => '-Xmx1024M', +mapreduce_reduce_java_opts => '-Xmx2560M', +mapreduce_reduce_shuffle_parallelcopies => 10, +mapreduce_task_io_sort_mb => 200, +mapreduce_task_io_sort_factor => 10, +yarn_nodemanager_resource_memory_mb => 40960, +yarn_resourcemanager_scheduler_class=> 'org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler', + } + + file { "$::cdh4::hadoop::config_directory/fair-scheduler.xml": +content => template('kraken/fair-scheduler.xml.erb'), +require => Class['cdh4::hadoop'], + } + file { 
"$::cdh4::hadoop::config_directory/fair-scheduler-allocation.xml": +content => template('kraken/fair-scheduler-allocation.xml.erb'), +require => Class['cdh4::hadoop'], + } + + include cdh4::hive + include cdh4::pig + include cdh4::sqoop +} + + + + + +# == Class role::hadoop::labs +# Common hadoop configs for the labs Kraken cluster +# +class role::hadoop::labs { + $namenode_hostname= "kraken0.pmtpa.wmflabs" + $hadoop_name_directory= "/var/lib/hadoop/name" + + $hadoop_data_directory= "/var/lib/hadoop/data" + $datanode_mounts = [ +"$hadoop_data_directory/a", +"$hadoop_data_directory/b", + ] + + class { 'cdh4::hadoop': +namenode_hostname => $namenode_hostname, +datanode_mounts => $datanode_mounts, +dfs_name_dir=> [$hadoop_name_directory], +dfs_block_size => 268435456, # 256 MB +io_file_buffer_size => 131072, +mapreduc
[MediaWiki-commits] [Gerrit] Adding role/hadoop.pp classes - change (operations/puppet)
Ottomata has submitted this change and it was merged. Change subject: Adding role/hadoop.pp classes .. Adding role/hadoop.pp classes Change-Id: I17f5af2acf0b0fd1313450d84e264222cf0d5296 --- A manifests/role/hadoop.pp M manifests/site.pp A templates/hadoop/fair-scheduler-allocation.xml.erb A templates/hadoop/fair-scheduler.xml.erb 4 files changed, 197 insertions(+), 1 deletion(-) Approvals: Ottomata: Verified; Looks good to me, approved jenkins-bot: Verified diff --git a/manifests/role/hadoop.pp b/manifests/role/hadoop.pp new file mode 100644 index 000..2a141d5 --- /dev/null +++ b/manifests/role/hadoop.pp @@ -0,0 +1,150 @@ +# role/hadoop.pp +# +# Role classes for Hadoop nodes. +# These role classes will configure hadoop properly in either +# the Analytics labs or Analytics production environments. + +# +# Usage: +# +# To install only hadoop client packages and configs: +# include role::hadoop +# +# To install a Hadoop Master (NameNode + ResourceManager, etc.): +# include role::hadoop::master +# +# To install a Hadoop Worker (DataNode + NodeManager + etc.): +# include role::hadoop::worker +# + + + +# == Class role::hadoop +# Installs base configs for Hadoop nodes +# +class role::hadoop { + # include common labs or production hadoop configs + # based on $::realm + if ($::realm == 'labs') { +include role::hadoop::labs + } + else { +include role::hadoop::production + } +} + +# == Class role::hadoop::master +# Includes cdh4::hadoop::master classes +# +class role::hadoop::master inherits role::hadoop { + system_role { "role::hadoop::master": description => "Hadoop Master (NameNode & ResourceManager)" } + include cdh4::hadoop::master +} + +# == Class role::hadoop::worker +# Includes cdh4::hadoop::worker classes +class role::hadoop::worker inherits role::hadoop { + system_role { "role::hadoop::worker": description => "Hadoop Worker (DataNode & NodeManager)" } + include cdh4::hadoop::worker +} + + +# == Class role::hadoop::production +# Common hadoop configs for the production Kraken cluster +# +class role::hadoop::production { + $namenode_hostname= "analytics1010.eqiad.wmnet" + $hadoop_name_directory= "/var/lib/hadoop/name" + + $hadoop_data_directory= "/var/lib/hadoop/data" + $datanode_mounts = [ +"$hadoop_data_directory/c", +"$hadoop_data_directory/d", +"$hadoop_data_directory/e", +"$hadoop_data_directory/f", +"$hadoop_data_directory/g", +"$hadoop_data_directory/h", +"$hadoop_data_directory/i", +"$hadoop_data_directory/j", +"$hadoop_data_directory/k", +"$hadoop_data_directory/l" + ] + + class { 'cdh4::hadoop': +namenode_hostname => $namenode_hostname, +datanode_mounts => $datanode_mounts, +dfs_name_dir=> [$hadoop_name_directory], +dfs_block_size => 268435456, # 256 MB +io_file_buffer_size => 131072, +mapreduce_map_tasks_maximum => ($::processorcount - 2) / 2, +mapreduce_reduce_tasks_maximum => ($::processorcount - 2) / 2, +mapreduce_job_reuse_jvm_num_tasks => 1, +mapreduce_map_memory_mb => 1536, +mapreduce_reduce_memory_mb => 3072, +mapreduce_map_java_opts => '-Xmx1024M', +mapreduce_reduce_java_opts => '-Xmx2560M', +mapreduce_reduce_shuffle_parallelcopies => 10, +mapreduce_task_io_sort_mb => 200, +mapreduce_task_io_sort_factor => 10, +yarn_nodemanager_resource_memory_mb => 40960, +yarn_resourcemanager_scheduler_class=> 'org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler', + } + + file { "$::cdh4::hadoop::config_directory/fair-scheduler.xml": +content => template('hadoop/fair-scheduler.xml.erb'), +require => Class['cdh4::hadoop'], + } + file { 
"$::cdh4::hadoop::config_directory/fair-scheduler-allocation.xml": +content => template('hadoop/fair-scheduler-allocation.xml.erb'), +require => Class['cdh4::hadoop'], + } + + include cdh4::hive + include cdh4::pig + include cdh4::sqoop +} + + + + + +# == Class role::hadoop::labs +# Common hadoop configs for the labs Kraken cluster +# +class role::hadoop::labs { + $namenode_hostname= "kraken0.pmtpa.wmflabs" + $hadoop_name_directory= "/var/lib/hadoop/name" + + $hadoop_data_directory= "/var/lib/hadoop/data" + $datanode_mounts = [ +"$hadoop_data_directory/a", +"$hadoop_data_directory/b", + ] + + class { 'cdh4::hadoop': +namenode_hostname => $namenode_hostname, +datanode_mounts => $datanode_mounts, +dfs_name_dir=> [$hadoop_name_directory], +dfs_block_size => 268435456, # 256 MB +io_file_buffer_size
[MediaWiki-commits] [Gerrit] 2 -> 4 spaces - change (operations/puppet)
Ottomata has uploaded a new change for review. https://gerrit.wikimedia.org/r/66871 Change subject: 2 -> 4 spaces .. 2 -> 4 spaces Change-Id: I736f8363dae16f3b36f0ee87bb4edd37e592af31 --- M manifests/role/hadoop.pp 1 file changed, 86 insertions(+), 87 deletions(-) git pull ssh://gerrit.wikimedia.org:29418/operations/puppet refs/changes/71/66871/1 diff --git a/manifests/role/hadoop.pp b/manifests/role/hadoop.pp index 2a141d5..d4b73c2 100644 --- a/manifests/role/hadoop.pp +++ b/manifests/role/hadoop.pp @@ -18,34 +18,33 @@ # - # == Class role::hadoop # Installs base configs for Hadoop nodes # class role::hadoop { - # include common labs or production hadoop configs - # based on $::realm - if ($::realm == 'labs') { -include role::hadoop::labs - } - else { -include role::hadoop::production - } +# include common labs or production hadoop configs +# based on $::realm +if ($::realm == 'labs') { +include role::hadoop::labs +} +else { +include role::hadoop::production +} } # == Class role::hadoop::master # Includes cdh4::hadoop::master classes # class role::hadoop::master inherits role::hadoop { - system_role { "role::hadoop::master": description => "Hadoop Master (NameNode & ResourceManager)" } - include cdh4::hadoop::master +system_role { "role::hadoop::master": description => "Hadoop Master (NameNode & ResourceManager)" } +include cdh4::hadoop::master } # == Class role::hadoop::worker # Includes cdh4::hadoop::worker classes class role::hadoop::worker inherits role::hadoop { - system_role { "role::hadoop::worker": description => "Hadoop Worker (DataNode & NodeManager)" } - include cdh4::hadoop::worker +system_role { "role::hadoop::worker": description => "Hadoop Worker (DataNode & NodeManager)" } +include cdh4::hadoop::worker } @@ -53,55 +52,55 @@ # Common hadoop configs for the production Kraken cluster # class role::hadoop::production { - $namenode_hostname= "analytics1010.eqiad.wmnet" - $hadoop_name_directory= "/var/lib/hadoop/name" +$namenode_hostname= "analytics1010.eqiad.wmnet" +$hadoop_name_directory= "/var/lib/hadoop/name" - $hadoop_data_directory= "/var/lib/hadoop/data" - $datanode_mounts = [ -"$hadoop_data_directory/c", -"$hadoop_data_directory/d", -"$hadoop_data_directory/e", -"$hadoop_data_directory/f", -"$hadoop_data_directory/g", -"$hadoop_data_directory/h", -"$hadoop_data_directory/i", -"$hadoop_data_directory/j", -"$hadoop_data_directory/k", -"$hadoop_data_directory/l" - ] +$hadoop_data_directory= "/var/lib/hadoop/data" +$datanode_mounts = [ +"$hadoop_data_directory/c", +"$hadoop_data_directory/d", +"$hadoop_data_directory/e", +"$hadoop_data_directory/f", +"$hadoop_data_directory/g", +"$hadoop_data_directory/h", +"$hadoop_data_directory/i", +"$hadoop_data_directory/j", +"$hadoop_data_directory/k", +"$hadoop_data_directory/l" +] - class { 'cdh4::hadoop': -namenode_hostname => $namenode_hostname, -datanode_mounts => $datanode_mounts, -dfs_name_dir=> [$hadoop_name_directory], -dfs_block_size => 268435456, # 256 MB -io_file_buffer_size => 131072, -mapreduce_map_tasks_maximum => ($::processorcount - 2) / 2, -mapreduce_reduce_tasks_maximum => ($::processorcount - 2) / 2, -mapreduce_job_reuse_jvm_num_tasks => 1, -mapreduce_map_memory_mb => 1536, -mapreduce_reduce_memory_mb => 3072, -mapreduce_map_java_opts => '-Xmx1024M', -mapreduce_reduce_java_opts => '-Xmx2560M', -mapreduce_reduce_shuffle_parallelcopies => 10, -mapreduce_task_io_sort_mb => 200, -mapreduce_task_io_sort_factor => 10, -yarn_nodemanager_resource_memory_mb => 40960, -yarn_resourcemanager_scheduler_class=> 
'org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler', - } +class { 'cdh4::hadoop': +namenode_hostname => $namenode_hostname, +datanode_mounts => $datanode_mounts, +dfs_name_dir=> [$hadoop_name_directory], +dfs_block_size => 268435456, # 256 MB +io_file_buffer_size => 131072, +mapreduce_map_tasks_maximum => ($::processorcount - 2) / 2, +mapreduce_reduce_tasks_maximum => ($::processorcount - 2) / 2, +mapreduce_job_reuse_jvm_num_tasks => 1, +mapreduce_map_memory_mb => 1536, +mapreduce_reduce_memory_mb
[MediaWiki-commits] [Gerrit] 2 -> 4 spaces - change (operations/puppet)
Ottomata has submitted this change and it was merged. Change subject: 2 -> 4 spaces .. 2 -> 4 spaces Change-Id: I736f8363dae16f3b36f0ee87bb4edd37e592af31 --- M manifests/role/hadoop.pp 1 file changed, 86 insertions(+), 87 deletions(-) Approvals: Ottomata: Verified; Looks good to me, approved jenkins-bot: Verified diff --git a/manifests/role/hadoop.pp b/manifests/role/hadoop.pp index 2a141d5..d4b73c2 100644 --- a/manifests/role/hadoop.pp +++ b/manifests/role/hadoop.pp @@ -18,34 +18,33 @@ # - # == Class role::hadoop # Installs base configs for Hadoop nodes # class role::hadoop { - # include common labs or production hadoop configs - # based on $::realm - if ($::realm == 'labs') { -include role::hadoop::labs - } - else { -include role::hadoop::production - } +# include common labs or production hadoop configs +# based on $::realm +if ($::realm == 'labs') { +include role::hadoop::labs +} +else { +include role::hadoop::production +} } # == Class role::hadoop::master # Includes cdh4::hadoop::master classes # class role::hadoop::master inherits role::hadoop { - system_role { "role::hadoop::master": description => "Hadoop Master (NameNode & ResourceManager)" } - include cdh4::hadoop::master +system_role { "role::hadoop::master": description => "Hadoop Master (NameNode & ResourceManager)" } +include cdh4::hadoop::master } # == Class role::hadoop::worker # Includes cdh4::hadoop::worker classes class role::hadoop::worker inherits role::hadoop { - system_role { "role::hadoop::worker": description => "Hadoop Worker (DataNode & NodeManager)" } - include cdh4::hadoop::worker +system_role { "role::hadoop::worker": description => "Hadoop Worker (DataNode & NodeManager)" } +include cdh4::hadoop::worker } @@ -53,55 +52,55 @@ # Common hadoop configs for the production Kraken cluster # class role::hadoop::production { - $namenode_hostname= "analytics1010.eqiad.wmnet" - $hadoop_name_directory= "/var/lib/hadoop/name" +$namenode_hostname= "analytics1010.eqiad.wmnet" +$hadoop_name_directory= "/var/lib/hadoop/name" - $hadoop_data_directory= "/var/lib/hadoop/data" - $datanode_mounts = [ -"$hadoop_data_directory/c", -"$hadoop_data_directory/d", -"$hadoop_data_directory/e", -"$hadoop_data_directory/f", -"$hadoop_data_directory/g", -"$hadoop_data_directory/h", -"$hadoop_data_directory/i", -"$hadoop_data_directory/j", -"$hadoop_data_directory/k", -"$hadoop_data_directory/l" - ] +$hadoop_data_directory= "/var/lib/hadoop/data" +$datanode_mounts = [ +"$hadoop_data_directory/c", +"$hadoop_data_directory/d", +"$hadoop_data_directory/e", +"$hadoop_data_directory/f", +"$hadoop_data_directory/g", +"$hadoop_data_directory/h", +"$hadoop_data_directory/i", +"$hadoop_data_directory/j", +"$hadoop_data_directory/k", +"$hadoop_data_directory/l" +] - class { 'cdh4::hadoop': -namenode_hostname => $namenode_hostname, -datanode_mounts => $datanode_mounts, -dfs_name_dir=> [$hadoop_name_directory], -dfs_block_size => 268435456, # 256 MB -io_file_buffer_size => 131072, -mapreduce_map_tasks_maximum => ($::processorcount - 2) / 2, -mapreduce_reduce_tasks_maximum => ($::processorcount - 2) / 2, -mapreduce_job_reuse_jvm_num_tasks => 1, -mapreduce_map_memory_mb => 1536, -mapreduce_reduce_memory_mb => 3072, -mapreduce_map_java_opts => '-Xmx1024M', -mapreduce_reduce_java_opts => '-Xmx2560M', -mapreduce_reduce_shuffle_parallelcopies => 10, -mapreduce_task_io_sort_mb => 200, -mapreduce_task_io_sort_factor => 10, -yarn_nodemanager_resource_memory_mb => 40960, -yarn_resourcemanager_scheduler_class=> 
'org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler', - } +class { 'cdh4::hadoop': +namenode_hostname => $namenode_hostname, +datanode_mounts => $datanode_mounts, +dfs_name_dir=> [$hadoop_name_directory], +dfs_block_size => 268435456, # 256 MB +io_file_buffer_size => 131072, +mapreduce_map_tasks_maximum => ($::processorcount - 2) / 2, +mapreduce_reduce_tasks_maximum => ($::processorcount - 2) / 2, +mapreduce_job_reuse_jvm_num_tasks => 1, +mapreduce_map_memory_mb => 1536, +mapreduce_reduce_memory_mb => 3072, +mapreduce_map_j
[MediaWiki-commits] [Gerrit] Updating cdh4 module to latest merged change. - change (operations/puppet)
Ottomata has uploaded a new change for review. https://gerrit.wikimedia.org/r/66872 Change subject: Updating cdh4 module to latest merged change. .. Updating cdh4 module to latest merged change. Change-Id: If9222fc5c60dea7e408c69cc2263e86349875f93 --- M modules/cdh4 1 file changed, 0 insertions(+), 0 deletions(-) git pull ssh://gerrit.wikimedia.org:29418/operations/puppet refs/changes/72/66872/1 diff --git a/modules/cdh4 b/modules/cdh4 index a9ce1b1..b5f7b70 16 --- a/modules/cdh4 +++ b/modules/cdh4 -Subproject commit a9ce1b1f5ac29392c6679dd7952dd0fd8db37bd3 +Subproject commit b5f7b70b2191528e9df2a81d2361e1a1fdf2469b -- To view, visit https://gerrit.wikimedia.org/r/66872 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: newchange Gerrit-Change-Id: If9222fc5c60dea7e408c69cc2263e86349875f93 Gerrit-PatchSet: 1 Gerrit-Project: operations/puppet Gerrit-Branch: production Gerrit-Owner: Ottomata ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
[MediaWiki-commits] [Gerrit] Updating cdh4 module to latest merged change. - change (operations/puppet)
Ottomata has submitted this change and it was merged. Change subject: Updating cdh4 module to latest merged change. .. Updating cdh4 module to latest merged change. Change-Id: If9222fc5c60dea7e408c69cc2263e86349875f93 --- M modules/cdh4 1 file changed, 0 insertions(+), 0 deletions(-) Approvals: Ottomata: Verified; Looks good to me, approved jenkins-bot: Verified diff --git a/modules/cdh4 b/modules/cdh4 index a9ce1b1..b5f7b70 16 --- a/modules/cdh4 +++ b/modules/cdh4 -Subproject commit a9ce1b1f5ac29392c6679dd7952dd0fd8db37bd3 +Subproject commit b5f7b70b2191528e9df2a81d2361e1a1fdf2469b -- To view, visit https://gerrit.wikimedia.org/r/66872 To unsubscribe, visit https://gerrit.wikimedia.org/r/settings Gerrit-MessageType: merged Gerrit-Change-Id: If9222fc5c60dea7e408c69cc2263e86349875f93 Gerrit-PatchSet: 1 Gerrit-Project: operations/puppet Gerrit-Branch: production Gerrit-Owner: Ottomata Gerrit-Reviewer: Ottomata Gerrit-Reviewer: jenkins-bot ___ MediaWiki-commits mailing list MediaWiki-commits@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
[MediaWiki-commits] [Gerrit] Moving compression configs to mapred-site.xml where they belong - change (operations...cdh4)
Ottomata has uploaded a new change for review. https://gerrit.wikimedia.org/r/66869 Change subject: Moving compression configs to mapred-site.xml where they belong .. Moving compression configs to mapred-site.xml where they belong Change-Id: I4adbf4ca20a194f9beaa8a7fb407c151e6809a0f --- M manifests/hadoop.pp M manifests/hadoop/defaults.pp M templates/hadoop/mapred-site.xml.erb M templates/hadoop/yarn-site.xml.erb 4 files changed, 35 insertions(+), 35 deletions(-) git pull ssh://gerrit.wikimedia.org:29418/operations/puppet/cdh4 refs/changes/69/66869/1 diff --git a/manifests/hadoop.pp b/manifests/hadoop.pp index 57f26ed..49562a0 100644 --- a/manifests/hadoop.pp +++ b/manifests/hadoop.pp @@ -21,8 +21,6 @@ # $yarn_local_path - path relative to JBOD mount point for yarn local directories. # $yarn_logs_path - path relative to JBOD mount point for yarn log directories. # $dfs_block_size - HDFS block size in bytes. Default 64MB. -# $enable_intermediate_compression - If true, intermediate MapReduce data will be compressed with Snappy. -# $enable_final_compession - If true, Final output of MapReduce jobs will be compressed with Snappy. # $io_file_buffer_size # $map_tasks_maximum # $reduce_tasks_maximum @@ -34,6 +32,8 @@ # $mapreduce_task_io_sort_factor # $mapreduce_map_java_opts # $mapreduce_child_java_opts +# $mapreduce_intermediate_compression - If true, intermediate MapReduce data will be compressed with Snappy.Default: true. +# $mapreduce_final_compession - If true, Final output of MapReduce jobs will be compressed with Snappy. Default: false. # $yarn_nodemanager_resource_memory_mb # $yarn_resourcemanager_scheduler_class - If you change this (e.g. to FairScheduler), you should also provide your own scheduler config .xml files outside of the cdh4 module. # $use_yarn @@ -49,8 +49,6 @@ $dfs_block_size = $::cdh4::hadoop::defaults::dfs_block_size, $enable_jmxremote= $::cdh4::hadoop::defaults::enable_jmxremote, $enable_webhdfs = $::cdh4::hadoop::defaults::enable_webhdfs, - $enable_intermediate_compression = $::cdh4::hadoop::defaults::enable_intermediate_compression, - $enable_final_compession = $::cdh4::hadoop::defaults::enable_final_compession, $io_file_buffer_size = $::cdh4::hadoop::defaults::io_file_buffer_size, $mapreduce_map_tasks_maximum = $::cdh4::hadoop::defaults::mapreduce_map_tasks_maximum, $mapreduce_reduce_tasks_maximum = $::cdh4::hadoop::defaults::mapreduce_reduce_tasks_maximum, @@ -62,6 +60,8 @@ $mapreduce_task_io_sort_factor = $::cdh4::hadoop::defaults::mapreduce_task_io_sort_factor, $mapreduce_map_java_opts = $::cdh4::hadoop::defaults::mapreduce_map_java_opts, $mapreduce_reduce_java_opts = $::cdh4::hadoop::defaults::mapreduce_reduce_java_opts, + $mapreduce_intermediate_compression = $::cdh4::hadoop::defaults::mapreduce_intermediate_compression, + $mapreduce_final_compession = $::cdh4::hadoop::defaults::mapreduce_final_compession, $yarn_nodemanager_resource_memory_mb = $::cdh4::hadoop::defaults::yarn_nodemanager_resource_memory_mb, $yarn_resourcemanager_scheduler_class= $::cdh4::hadoop::defaults::yarn_resourcemanager_scheduler_class, $use_yarn= $::cdh4::hadoop::defaults::use_yarn diff --git a/manifests/hadoop/defaults.pp b/manifests/hadoop/defaults.pp index 5ad57aa..3fc6c5d 100644 --- a/manifests/hadoop/defaults.pp +++ b/manifests/hadoop/defaults.pp @@ -10,8 +10,6 @@ $dfs_block_size = 67108864 # 64MB default $enable_jmxremote= true $enable_webhdfs = true - $enable_intermediate_compression = true - $enable_final_compession = false $io_file_buffer_size = undef $mapreduce_map_tasks_maximum 
= undef $mapreduce_reduce_tasks_maximum = undef @@ -23,6 +21,8 @@ $mapreduce_task_io_sort_factor = undef $mapreduce_map_java_opts = undef $mapreduce_reduce_java_opts = undef + $mapreduce_intermediate_compression = true + $mapreduce_final_compression = false $yarn_nodemanager_resource_memory_mb = undef $yarn_resourcemanager_scheduler_class= undef $use_yarn= true diff --git a/templates/hadoop/mapred-site.xml.erb b/templates/hadoop/mapred-site.xml.erb index 137c43a..1dba226 100644 --- a/templates/hadoop/mapred-site.xml.erb +++ b/templates/hadoop/mapred-site.xml.erb @@ -115,6 +115,30 @@ <% end -%> + +<property> +<name>mapreduce.map.output.compress</name> +<value><%= mapreduce_intermediate_compression %></value> +
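For reference, after this rename a caller would pass the compression settings roughly as sketched below; the hostname, mount list and values are illustrative only, not taken from the patch. Note also that the parameter list in hadoop.pp spells the second option $mapreduce_final_compession while defaults.pp defines $mapreduce_final_compression, so the default for that flag appears not to resolve as written.

    # Hypothetical declaration; hostname, mounts and values are illustrative.
    class { 'cdh4::hadoop':
        namenode_hostname                   => 'namenode.example.wmnet',
        datanode_mounts                     => ['/var/lib/hadoop/data/a'],
        dfs_name_dir                        => ['/var/lib/hadoop/name'],
        # Snappy-compress intermediate map output (was enable_intermediate_compression).
        mapreduce_intermediate_compression  => true,
        # Spelled as in this patch's hadoop.pp parameter list (was enable_final_compession).
        mapreduce_final_compession          => false,
    }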