[MediaWiki-commits] [Gerrit] Can't use the same config file for eventlogging and webrequest - change (analytics/kraken)

2014-04-29 Thread Ottomata (Code Review)
Ottomata has uploaded a new change for review.

  https://gerrit.wikimedia.org/r/130346

Change subject: Can't use the same config file for eventlogging and webrequest
..

Can't use the same config file for eventlogging and webrequest

Duh!  They have different timestamps, don't know why I did this yesterday.

Change-Id: I05d77dae1b9f71ac2cb9ce1f331fec11a73fe760
---
C kraken-etl/conf/camus.eventlogging.properties
R kraken-etl/conf/camus.webrequest.properties
2 files changed, 10 insertions(+), 10 deletions(-)


  git pull ssh://gerrit.wikimedia.org:29418/analytics/kraken refs/changes/46/130346/1

diff --git a/kraken-etl/conf/camus.properties b/kraken-etl/conf/camus.eventlogging.properties
similarity index 92%
copy from kraken-etl/conf/camus.properties
copy to kraken-etl/conf/camus.eventlogging.properties
index 02e2e4e..f5854e8 100644
--- a/kraken-etl/conf/camus.properties
+++ b/kraken-etl/conf/camus.eventlogging.properties
@@ -7,11 +7,11 @@
 # where completed Camus job output directories are kept, usually a sub-dir in the base.path
 etl.execution.history.path=hdfs://kraken/wmf/camus/history
 
-# Our timestamps look like 2013-09-20T15:40:17
-camus.message.timestamp.format=yyyy-MM-dd'T'HH:mm:ss
+# Eventlogging timestamps are in unix epoch seconds
+camus.message.timestamp.format=unix
 
 # use the dt field
-camus.message.timestamp.field=dt
+camus.message.timestamp.field=timestamp
 
 # Store output into hourly buckets
 etl.output.file.time.partition.mins=60
@@ -33,7 +33,7 @@
 # Max hadoop tasks to use, each task can pull multiple topic partitions.
 # Currently importing mobile, bits, and eventlogging each with 10 partitions.  Setting
 # map.tasks to 40.
-mapred.map.tasks=40
+mapred.map.tasks=10
 
 # Connection parameters.
 kafka.brokers=analytics1021.eqiad.wmnet:9092,analytics1022.eqiad.wmnet:9092
@@ -56,10 +56,10 @@
 kafka.blacklist.topics=
 
 # For now, we are only importing webrequest_mobile logs.
-kafka.whitelist.topics=webrequest_mobile,webrequest_bits,webrequest_text,eventlogging-00
+kafka.whitelist.topics=eventlogging-00
 
 # Name of the client as seen by kafka
-kafka.client.name=camus-webrequest-01
+kafka.client.name=camus-eventlogging-01
 
 # Fetch Request Parameters
 #kafka.fetch.buffer.size=
diff --git a/kraken-etl/conf/camus.properties b/kraken-etl/conf/camus.webrequest.properties
similarity index 95%
rename from kraken-etl/conf/camus.properties
rename to kraken-etl/conf/camus.webrequest.properties
index 02e2e4e..40c83ef 100644
--- a/kraken-etl/conf/camus.properties
+++ b/kraken-etl/conf/camus.webrequest.properties
@@ -31,9 +31,9 @@
 etl.record.writer.provider.class=com.linkedin.camus.etl.kafka.common.SequenceFileRecordWriterProvider
 
 # Max hadoop tasks to use, each task can pull multiple topic partitions.
-# Currently importing mobile, bits, and eventlogging each with 10 partitions.  Setting
-# map.tasks to 40.
-mapred.map.tasks=40
+# Currently importing mobile, and bits each with 10 partitions.  Setting
+# map.tasks to 30.
+mapred.map.tasks=30
 
 # Connection parameters.
 kafka.brokers=analytics1021.eqiad.wmnet:9092,analytics1022.eqiad.wmnet:9092
@@ -56,7 +56,7 @@
 kafka.blacklist.topics=
 
 # For now, we are only importing webrequest_mobile logs.
-kafka.whitelist.topics=webrequest_mobile,webrequest_bits,webrequest_text,eventlogging-00
+kafka.whitelist.topics=webrequest_mobile,webrequest_bits,webrequest_text
 
 # Name of the client as seen by kafka
 kafka.client.name=camus-webrequest-01
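
Reviewer note: the reason for the split can be illustrated outside of Camus. The sketch below is Python, not Camus code. `extract_timestamp` is a hypothetical helper, the strptime pattern is only a Python analogue of the Java date pattern in the webrequest config, and the sample epoch value is an assumed one chosen to match the example timestamp in the config comment (2013-09-20T15:40:17). It shows that one (field, format) pair cannot read both kinds of message.

```python
from datetime import datetime, timezone


def extract_timestamp(message: dict, field: str, fmt: str) -> datetime:
    """Hypothetical helper: read `field` from a decoded message and parse it.

    fmt == "unix" means epoch seconds (the eventlogging setting); any other
    value is treated as a strptime pattern (stand-in for the Java pattern).
    """
    raw = message[field]
    if fmt == "unix":
        return datetime.fromtimestamp(int(raw), tz=timezone.utc)
    return datetime.strptime(raw, fmt).replace(tzinfo=timezone.utc)


# Assumed sample messages; only the timestamp fields matter here.
webrequest_msg = {"dt": "2013-09-20T15:40:17"}
eventlogging_msg = {"timestamp": 1379691617}

# Settings mirroring the two properties files.
webrequest_cfg = ("dt", "%Y-%m-%dT%H:%M:%S")
eventlogging_cfg = ("timestamp", "unix")

if __name__ == "__main__":
    print(extract_timestamp(webrequest_msg, *webrequest_cfg))
    print(extract_timestamp(eventlogging_msg, *eventlogging_cfg))
    try:
        # Applying the webrequest settings to an eventlogging message fails,
        # because that message has no "dt" field.
        extract_timestamp(eventlogging_msg, *webrequest_cfg)
    except KeyError as missing:
        print("shared config cannot parse eventlogging:", missing)
```

With each topic family given its own properties file, each import job can carry the field and format that match its own messages.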

-- 
To view, visit https://gerrit.wikimedia.org/r/130346
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I05d77dae1b9f71ac2cb9ce1f331fec11a73fe760
Gerrit-PatchSet: 1
Gerrit-Project: analytics/kraken
Gerrit-Branch: master
Gerrit-Owner: Ottomata o...@wikimedia.org

___
MediaWiki-commits mailing list
MediaWiki-commits@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits


[MediaWiki-commits] [Gerrit] Can't use the same config file for eventlogging and webrequest - change (analytics/kraken)

2014-04-29 Thread Ottomata (Code Review)
Ottomata has submitted this change and it was merged.

Change subject: Can't use the same config file for eventlogging and webrequest
..


Can't use the same config file for eventlogging and webrequest

Duh!  They have different timestamps, don't know why I did this yesterday.

Change-Id: I05d77dae1b9f71ac2cb9ce1f331fec11a73fe760
---
C kraken-etl/conf/camus.eventlogging.properties
R kraken-etl/conf/camus.webrequest.properties
2 files changed, 11 insertions(+), 11 deletions(-)

Approvals:
  Ottomata: Verified; Looks good to me, approved



diff --git a/kraken-etl/conf/camus.properties b/kraken-etl/conf/camus.eventlogging.properties
similarity index 91%
copy from kraken-etl/conf/camus.properties
copy to kraken-etl/conf/camus.eventlogging.properties
index 02e2e4e..26adde6 100644
--- a/kraken-etl/conf/camus.properties
+++ b/kraken-etl/conf/camus.eventlogging.properties
@@ -7,11 +7,11 @@
 # where completed Camus job output directories are kept, usually a sub-dir in the base.path
 etl.execution.history.path=hdfs://kraken/wmf/camus/history
 
-# Our timestamps look like 2013-09-20T15:40:17
-camus.message.timestamp.format=yyyy-MM-dd'T'HH:mm:ss
+# Eventlogging timestamps are in unix epoch seconds
+camus.message.timestamp.format=unix
 
 # use the dt field
-camus.message.timestamp.field=dt
+camus.message.timestamp.field=timestamp
 
 # Store output into hourly buckets
 etl.output.file.time.partition.mins=60
@@ -33,7 +33,7 @@
 # Max hadoop tasks to use, each task can pull multiple topic partitions.
 # Currently importing mobile, bits, and eventlogging each with 10 partitions.  Setting
 # map.tasks to 40.
-mapred.map.tasks=40
+mapred.map.tasks=10
 
 # Connection parameters.
 kafka.brokers=analytics1021.eqiad.wmnet:9092,analytics1022.eqiad.wmnet:9092
@@ -55,11 +55,11 @@
 # if whitelist has values, only whitelisted topic are pulled.  nothing on the blacklist is pulled
 kafka.blacklist.topics=
 
-# For now, we are only importing webrequest_mobile logs.
-kafka.whitelist.topics=webrequest_mobile,webrequest_bits,webrequest_text,eventlogging-00
+# Import eventlogging logs
+kafka.whitelist.topics=eventlogging-00
 
 # Name of the client as seen by kafka
-kafka.client.name=camus-webrequest-01
+kafka.client.name=camus-eventlogging-01
 
 # Fetch Request Parameters
 #kafka.fetch.buffer.size=
diff --git a/kraken-etl/conf/camus.properties b/kraken-etl/conf/camus.webrequest.properties
similarity index 95%
rename from kraken-etl/conf/camus.properties
rename to kraken-etl/conf/camus.webrequest.properties
index 02e2e4e..40c83ef 100644
--- a/kraken-etl/conf/camus.properties
+++ b/kraken-etl/conf/camus.webrequest.properties
@@ -31,9 +31,9 @@
 etl.record.writer.provider.class=com.linkedin.camus.etl.kafka.common.SequenceFileRecordWriterProvider
 
 # Max hadoop tasks to use, each task can pull multiple topic partitions.
-# Currently importing mobile, bits, and eventlogging each with 10 partitions.  Setting
-# map.tasks to 40.
-mapred.map.tasks=40
+# Currently importing mobile, and bits each with 10 partitions.  Setting
+# map.tasks to 30.
+mapred.map.tasks=30
 
 # Connection parameters.
 kafka.brokers=analytics1021.eqiad.wmnet:9092,analytics1022.eqiad.wmnet:9092
@@ -56,7 +56,7 @@
 kafka.blacklist.topics=
 
 # For now, we are only importing webrequest_mobile logs.
-kafka.whitelist.topics=webrequest_mobile,webrequest_bits,webrequest_text,eventlogging-00
+kafka.whitelist.topics=webrequest_mobile,webrequest_bits,webrequest_text
 
 # Name of the client as seen by kafka
 kafka.client.name=camus-webrequest-01

-- 
To view, visit https://gerrit.wikimedia.org/r/130346
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I05d77dae1b9f71ac2cb9ce1f331fec11a73fe760
Gerrit-PatchSet: 2
Gerrit-Project: analytics/kraken
Gerrit-Branch: master
Gerrit-Owner: Ottomata o...@wikimedia.org
Gerrit-Reviewer: Ottomata o...@wikimedia.org
Gerrit-Reviewer: jenkins-bot 
