Nuria has submitted this change and it was merged.

Change subject: Fix inconsistencies when using --all-projects
......................................................................


Fix inconsistencies when using --all-projects

When the --all-projects flag is used, all.csv files are generated.
They contain the sum of all other projects' values. Date range
parameters do not affect these files, because they are meant to
always be a correct sum of the existing per-project files. So,
regardless of the date range parameters, those files are always
recalculated using the whole date range of the existing per-project
files.

This change fixes an inconsistency that made all.csv files have
different date ranges depending on whether they had daily or another
granularity (weekly, monthly, etc.). Now every all.csv file should
have the same date range, which should match the per-project files.

Bug: T106554
Change-Id: I2a3aa90618bbe402a7116857aaf96c1f87a71f0d
---
M aggregator/projectcounts.py
M tests/test_projectcounts/test_per_project_aggregation.py
2 files changed, 45 insertions(+), 2 deletions(-)

Approvals:
  Nuria: Verified; Looks good to me, approved



diff --git a/aggregator/projectcounts.py b/aggregator/projectcounts.py
index 80fb0f5..5cf6132 100644
--- a/aggregator/projectcounts.py
+++ b/aggregator/projectcounts.py
@@ -575,12 +575,14 @@
 
     # Writes aggregations across all projects
     if compute_all_projects:
+        oldest_date = util.parse_string_to_date(min(all_projects_data.keys()))
+        newest_date = util.parse_string_to_date(max(all_projects_data.keys()))
         _write_raw_and_aggregated_csv_data(
             target_dir_abs,
             'all',
             all_projects_data,
-            first_date,
-            last_date,
+            oldest_date,
+            newest_date,
             additional_aggregators,
             bad_dates,
             force_recomputation)
diff --git a/tests/test_projectcounts/test_per_project_aggregation.py b/tests/test_projectcounts/test_per_project_aggregation.py
index b5f66c3..14da8d1 100644
--- a/tests/test_projectcounts/test_per_project_aggregation.py
+++ b/tests/test_projectcounts/test_per_project_aggregation.py
@@ -366,3 +366,44 @@
             '2014-11-02,321,321,0,0',
             '2014-11-03,310,310,0,0',
             ])
+
+    def test_update_per_project_compute_all_projects_outside_range(self):
+        fixture = self.get_fixture_dir_abs(
+            '2014-11-3projects-for-aggregation')
+
+        # ask to just compute day 3
+        first_date = datetime.date(2014, 11, 3)
+        last_date = datetime.date(2014, 11, 3)
+
+        enwiki_file_abs = os.path.join(self.daily_raw_dir_abs, 'enwiki.csv')
+        dewiki_file_abs = os.path.join(self.daily_raw_dir_abs, 'dewiki.csv')
+        frwiki_file_abs = os.path.join(self.daily_raw_dir_abs, 'frwiki.csv')
+        # however, files already contain counts for the other days
+        self.create_file(enwiki_file_abs, [
+            '2014-11-01,103,103,0,0',
+            '2014-11-02,108,108,0,0'
+        ])
+        self.create_file(dewiki_file_abs, [
+            '2014-11-01,121,121,0,0',
+            '2014-11-02,105,105,0,0'
+        ])
+        self.create_file(frwiki_file_abs, [
+            '2014-11-01,99,99,0,0',
+            '2014-11-02,108,108,0,0'
+        ])
+
+        aggregator.update_per_project_csvs_for_dates(
+            fixture,
+            self.data_dir_abs,
+            first_date,
+            last_date,
+            compute_all_projects=True)
+
+        # although the call was just for the day 3
+        # all dates are calculated for the totals file
+        all_file_abs = os.path.join(self.daily_raw_dir_abs, 'all.csv')
+        self.assert_file_content_equals(all_file_abs, [
+            '2014-11-01,323,323,0,0',
+            '2014-11-02,321,321,0,0',
+            '2014-11-03,310,310,0,0',
+            ])
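
For reference, the date-range derivation in the projectcounts.py hunk
above can be sketched in isolation. This is a minimal sketch, not the
aggregator's code: parse_string_to_date below is a hypothetical stand-in
for util.parse_string_to_date, the row values are illustrative, and it
assumes that all_projects_data is keyed by zero-padded YYYY-MM-DD
strings, so that min() and max() on the keys return the chronologically
oldest and newest dates.

```python
import datetime


def parse_string_to_date(date_str):
    # Hypothetical stand-in for util.parse_string_to_date.
    # Assumption: dates are zero-padded YYYY-MM-DD strings.
    return datetime.datetime.strptime(date_str, "%Y-%m-%d").date()


def full_date_range(all_projects_data):
    # Zero-padded ISO date strings sort lexicographically in
    # chronological order, so min()/max() on the string keys pick
    # the oldest and newest dates without parsing every key.
    keys = list(all_projects_data.keys())
    return parse_string_to_date(min(keys)), parse_string_to_date(max(keys))


# Assumed shape: one summed CSV row per date (values are illustrative).
all_projects_data = {
    "2014-11-02": "2014-11-02,321,321,0,0",
    "2014-11-03": "2014-11-03,310,310,0,0",
    "2014-11-01": "2014-11-01,323,323,0,0",
}

oldest_date, newest_date = full_date_range(all_projects_data)
```

With keys for 2014-11-01 through 2014-11-03, the derived range covers
all three days even when the caller asked only for day 3, which is the
behaviour the new test checks.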

-- 
To view, visit https://gerrit.wikimedia.org/r/241620
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I2a3aa90618bbe402a7116857aaf96c1f87a71f0d
Gerrit-PatchSet: 1
Gerrit-Project: analytics/aggregator
Gerrit-Branch: master
Gerrit-Owner: Mforns <[email protected]>
Gerrit-Reviewer: Nuria <[email protected]>
Gerrit-Reviewer: jenkins-bot <>

_______________________________________________
MediaWiki-commits mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
