Hello! I would like to submit the first of a set of performance
optimizations to the
splitter that improve the performance and reduce memory usage.
The end result is that a core 2 duo can split the entire planet in 3.5 hours
with
-Xmx1700m and without using any temporary space. (67 minutes on a 8gb core
i7)
Everything is up on github. I'm sending the first set of patches now.

Perhaps the biggest change is that I've made the splitter accept my binary
OSM format.
I created a binary OSM format that is 30-50% smaller than the bzipped
planet (about 5.3gb without metadata) and 5x-10x faster to parse than
the gzipped planet. As that format is substantially smaller than the
splitter's cache, and to simplify my patch, I removed the cache before
refactoring the
splitter. My original announcement is at
http://lists.openstreetmap.org/pipermail/dev/2010-April/019370.html
Since that post I have fixed the issue of unwanted extra precision.

I want your thoughts on the binary format before submitting it. Please have
a look and tell me what you think. In this email I am submitting some other
optimizations and patches. The performance numbers
I quote assume that I was using the binary format; I am
unsure what benefit they have on trunk.

In IntIntMap, the current offset value for rehashing is way to low,
given the dense nature of OSM nodeID values and a sequential insertion
order. This problem is most obvious on a whole-planet split.
It results in huge numbers of collisions; requiring 30+ probes
on average. Note when benchmarking that the collision rate is
irregular throughout the planet file; with a --max-nodes=4000000 whole
planet split, there's virtually no performance differences on the
first 90M nodes, yet by 120M nodes, using a random 31-bit prime number
is over 3x faster.

My multithread changes reduced the real and user CPU time by 30-50% on a uk
extract. Other tests, e.g., a whole planet, show almost no effect. I think
this is because a
lot of my changes increase the efficiency of code that is parallelized,
which paradoxically
ends up reducing the relative benefits of parellelization.

The single biggest remaining slowdown is that for each point, the splitter
sends each Node to each area (represented
by an OSMWriter), which writes it in the output file if it is in the bounds
of the region (with overlap), or does
nothing otherwise. On a whole planet, with 840 areas and 550M nodes, that is
over 5 trillion bounds-checks that are
not run in parallel. An index that can can map a node to the regions that
may contain it could avoid 99% of these checks.

Scott
From 3d3d7275b8c83ecda4e71a59023c4f11485388db Mon Sep 17 00:00:00 2001
From: Scott Crosby <scro...@cs.rice.edu>
Date: Tue, 18 May 2010 21:14:52 -0500
Subject: [PATCH 4/8] Lazily generate the tag hash table.

---
 src/uk/me/parabola/splitter/Element.java |   10 +++++++++-
 1 files changed, 9 insertions(+), 1 deletions(-)

diff --git a/src/uk/me/parabola/splitter/Element.java b/src/uk/me/parabola/splitter/Element.java
index 6404cb1..0ef23da 100644
--- a/src/uk/me/parabola/splitter/Element.java
+++ b/src/uk/me/parabola/splitter/Element.java
@@ -12,6 +12,7 @@
  */
 package uk.me.parabola.splitter;
 
+import java.util.Collections;
 import java.util.HashMap;
 import java.util.Iterator;
 import java.util.Map;
@@ -20,7 +21,7 @@ import java.util.Map;
  * @author Steve Ratcliffe
  */
 public class Element {
-	private Map<String, String> tags = new HashMap<String, String>(8);
+	private Map<String, String> tags; 
 	private int id;
 
 	protected void setId(int id) {
@@ -34,11 +35,16 @@ public class Element {
 	public void reset() {
 		this.id = 0;
 		tags.clear();
+		tags = null;
 	}
 
 	public void addTag(String key, String value) {
 		if (key.equals("created_by"))
 			return;
+		// Most elements are nodes. Most nodes have no tags. Create the tag table lazily
+		if (tags == null)
+			tags = new HashMap<String, String>(4);
+
 		tags.put(key, value);
 	}
 
@@ -47,6 +53,8 @@ public class Element {
 	}
 
 	public Iterator<Map.Entry<String, String>> tagsIterator() {
+		if (tags == null)
+			return Collections.EMPTY_SET.iterator();
 		return tags.entrySet().iterator();
 	}
 }
-- 
1.7.1

From e3d967c1630b918278867c04bca8a51c577284c0 Mon Sep 17 00:00:00 2001
From: Scott Crosby <scro...@cs.rice.edu>
Date: Tue, 11 May 2010 08:35:19 -0500
Subject: [PATCH 5/8] Change the offset for the hash table.

Cuts the time spent in this hash table by 3x-6x.
---
 src/uk/me/parabola/splitter/IntIntMap.java |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/src/uk/me/parabola/splitter/IntIntMap.java b/src/uk/me/parabola/splitter/IntIntMap.java
index 51f9310..53f281d 100644
--- a/src/uk/me/parabola/splitter/IntIntMap.java
+++ b/src/uk/me/parabola/splitter/IntIntMap.java
@@ -38,7 +38,7 @@ public class IntIntMap {
 	//private int hit;
 	//private int miss;
 
-	private static final int OFF = 7;
+	private static final int OFF = 1472057057;
 
 	public IntIntMap() {
 		this(INIT_SIZE, 0.9f);
-- 
1.7.1

From daddb0abd69bd4cfb2c25f391dc94ca1e31df93c Mon Sep 17 00:00:00 2001
From: Scott Crosby <scro...@cs.rice.edu>
Date: Tue, 11 May 2010 08:35:40 -0500
Subject: [PATCH 6/8] IntIntMap: Simplify hash search function.

---
 src/uk/me/parabola/splitter/IntIntMap.java |   31 ++++++++++------------------
 1 files changed, 11 insertions(+), 20 deletions(-)

diff --git a/src/uk/me/parabola/splitter/IntIntMap.java b/src/uk/me/parabola/splitter/IntIntMap.java
index 53f281d..6b4899c 100644
--- a/src/uk/me/parabola/splitter/IntIntMap.java
+++ b/src/uk/me/parabola/splitter/IntIntMap.java
@@ -35,8 +35,8 @@ public class IntIntMap {
 
 	private int targetSize;
 	private final float loadFactor;
-	//private int hit;
-	//private int miss;
+	//private static long hit;
+	//private static long miss;
 
 	private static final int OFF = 1472057057;
 
@@ -131,26 +131,17 @@ public class IntIntMap {
 	}
 
 	private int keyPos(int key) {
-		int k = key & (capacity - 1);
-
+		int mask = capacity - 1;
+		int k = key & mask;
 		int h1 = keys[k];
-		if (h1 != 0 && h1 != key) {
-			for (int k2 = k+OFF; ; k2+= OFF) {
-				//miss++;
-				if (k2 >= capacity)
-					//noinspection AssignmentToForLoopParameter
-					k2 -= capacity;
-
-				int fk = keys[k2];
-				if (fk == 0 || fk == key) {
-					//hit++;
-					//if ((size % 100000) == 0)
-					//	System.out.printf("hit/miss %f at size %d, %d\n",  100.0*hit/(miss - hit), size, targetSize);
-
-					return k2;
-				}
-			}
+		// hit++;
+		while (h1 != 0 && h1 != key) {
+			// miss++;
+			k = (k + OFF) & mask;
+			h1 = keys[k];
 		}
+		// if (((hit) % 1000000) == 0)
+		//     System.out.printf("hit/probe %f at hit %d probe %d, %d\n",100.0*hit/(miss + hit), hit, hit+miss,size);
 		return k;
 	}
 
-- 
1.7.1

From efcae96e87c95c49177f06c953b7c765db674718 Mon Sep 17 00:00:00 2001
From: Scott Crosby <scro...@cs.rice.edu>
Date: Wed, 9 Jun 2010 23:02:55 -0500
Subject: [PATCH 8/8] New thread design.

---
 src/uk/me/parabola/splitter/SplitProcessor.java |  122 ++++++++++-------------
 1 files changed, 53 insertions(+), 69 deletions(-)

diff --git a/src/uk/me/parabola/splitter/SplitProcessor.java b/src/uk/me/parabola/splitter/SplitProcessor.java
index a0bf0d5..5c6b834 100644
--- a/src/uk/me/parabola/splitter/SplitProcessor.java
+++ b/src/uk/me/parabola/splitter/SplitProcessor.java
@@ -16,22 +16,26 @@ import java.io.IOException;
 import java.util.ArrayList;
 import java.util.BitSet;
 import java.util.Date;
+import java.util.List;
 import java.util.concurrent.ArrayBlockingQueue;
 import java.util.concurrent.BlockingQueue;
 
+
 /**
  * Splits a map into multiple areas.
  */
 class SplitProcessor implements MapProcessor {
-
+	public static final int NO_ELEMENTS = 10;
+	public static final int BUNDLE_SIZE = 2000;
+	
+	
 	private final SplitIntMap coords = new SplitIntMap();
 	private final SplitIntMap ways = new SplitIntMap();
 	private final IntObjMap<long[]> bigWays = new IntObjMap<long[]>();
 
 	private final OSMWriter[] writers;
-	private final BlockingQueue<Element>[] writerInputQueues;
-	private final BlockingQueue<InputQueueInfo> writerInputQueue;
-	private final ArrayList<Thread> workerThreads;
+	private final BlockingQueue<List<Element>>[] writerInputQueues;
+	private final List<Element>[] bundlingQueues;
 
 	private Node currentNode = new Node();
 	private int currentNodeAreaSet;
@@ -44,27 +48,23 @@ class SplitProcessor implements MapProcessor {
 	
 	private final int maxThreads;
 
+	Thread futures[];
+		
 	SplitProcessor(OSMWriter[] writers, int maxThreads) {
 		this.writers = writers;
 		this.maxThreads = maxThreads;
-		this.writerInputQueue = new ArrayBlockingQueue<InputQueueInfo>(writers.length); 
 		this.writerInputQueues = new BlockingQueue[writers.length];
+		this.bundlingQueues = new ArrayList[writers.length];
+		this.futures = new Thread[writers.length];
 		for (int i = 0; i < writerInputQueues.length;i++) {
-			writerInputQueues[i] = new ArrayBlockingQueue<Element>(NO_ELEMENTS);
-			writerInputQueue.add(new InputQueueInfo(this.writers[i], writerInputQueues[i]));
+			writerInputQueues[i] = new ArrayBlockingQueue<List<Element>>(NO_ELEMENTS);
+			bundlingQueues[i] = new ArrayList<Element>(BUNDLE_SIZE);
+			futures[i] = new Thread(new OSMWriterWorker(writers[i],writerInputQueues[i]));
+			futures[i].start();
 		}
 
 		currentWayAreaSet = new BitSet(writers.length);
 		currentRelAreaSet = new BitSet(writers.length);
-		
-		int noOfWorkerThreads = this.maxThreads - 1;
-		workerThreads = new ArrayList<Thread>(noOfWorkerThreads);
-		for (int i = 0; i < noOfWorkerThreads; i++) {
-			Thread worker = new Thread(new OSMWriterWorker());
-			worker.setName("worker-" + i);
-			workerThreads.add(worker);
-			worker.start();
-		}
 	}
 
 	@Override
@@ -206,20 +206,18 @@ class SplitProcessor implements MapProcessor {
 
 	@Override
 	public void endMap() {
-		for (int i = 0; i < writerInputQueues.length; i++) {
-			try {
-				writerInputQueues[i].put(STOP_ELEMENT);
-			} catch (InterruptedException e) {
-				throw new RuntimeException("Failed to add the stop element for worker thread " + i, e);
-			}
-		}
-		for (Thread workerThread : workerThreads) {
-			try {
-				workerThread.join();
+		try {
+			// Push the stop element into every queue.
+			for (int i = 0 ; i < futures.length ; i++ )
+				addToWorkingQueue(i,STOP_ELEMENT);
+			// Wait for them to all exit.
+			for (int i = 0 ; i < futures.length ; i++ )
+				futures[i].join();
 			} catch (InterruptedException e) {
-				throw new RuntimeException("Failed to join for thread " + workerThread.getName(), e);
+				// TODO Auto-generated catch block
+				e.printStackTrace();
 			}
-		}
+
 		for (OSMWriter writer : writers) {
 			writer.finishWrite();
 		}
@@ -320,29 +318,35 @@ class SplitProcessor implements MapProcessor {
 	}
 
 	private void addToWorkingQueue(int writerNumber, Element element) {
+		List<Element> bundle=bundlingQueues[writerNumber];
+		bundle.add(element);
+		if (bundle.size() < BUNDLE_SIZE && element != STOP_ELEMENT)
+			return;
 		try {
-			writerInputQueues[writerNumber].put(element);
+			BlockingQueue<List<Element>> queue = writerInputQueues[writerNumber];
+			//System.out.println("Putting: "+element.toString());
+			queue.put(bundle);
+			bundlingQueues[writerNumber] = new ArrayList<Element>(BUNDLE_SIZE);
 		} catch (InterruptedException e) {
 			throw new RuntimeException("Failed to write node " + element.getId() + " to worker thread " + writerNumber, e);
 		}
 	}
 
-	private static class InputQueueInfo {
-		private final OSMWriter writer;
-		private final BlockingQueue<Element> inputQueue;
-
-		public InputQueueInfo(OSMWriter writer, BlockingQueue<Element> inputQueue) {
-      this.writer = writer;
-			this.inputQueue = inputQueue;
-		}
-	}
 
 	private static final Element STOP_ELEMENT = new Element();
 
-	public static final int NO_ELEMENTS = 1000;
 
 	private class OSMWriterWorker implements Runnable {
 
+		
+		private OSMWriter writer;
+		private BlockingQueue<List<Element>> queue;
+
+		public OSMWriterWorker(OSMWriter writer, BlockingQueue<List<Element>> queue) {
+			this.writer = writer;
+			this.queue = queue;
+		}
+
 		public void processElement(Element element, OSMWriter writer) throws IOException {
 			if (element instanceof Node) {
 				writer.write((Node) element);
@@ -355,44 +359,24 @@ class SplitProcessor implements MapProcessor {
 
 		@Override
 		public void run() {
-			boolean finished = false;
-			while (!finished) {
-				InputQueueInfo workPackage = writerInputQueue.poll();
-				if (workPackage==null) {
-					finished=true;
-				} else {
-					while (!workPackage.inputQueue.isEmpty()) {
-						Element element =null;
-						try {
-							element = workPackage.inputQueue.poll();
-							if (element == null) {
-								writerInputQueue.put(workPackage);
-								workPackage=null;
-								break;
-							} else if (element == STOP_ELEMENT) {
-								workPackage=null;
-								System.out.println("Thread " + Thread.currentThread().getName() + " has finished");
-								// this writer is finished
-								break;
+			while (true) {
+				//System.out.println("Doing loop");
+				try {
+						
+					List<Element> elements = queue.take();
+					for (Element element : elements)
+					if (element == STOP_ELEMENT) {
+								return;
 							} else {
-								processElement(element, workPackage.writer);
+								processElement(element, writer);
 							}
 							
 						} catch (InterruptedException e) {
 							throw new RuntimeException("Thread " + Thread.currentThread().getName() + " failed to get next element", e);
 						} catch (IOException e) {
-							throw new RuntimeException("Thread " + Thread.currentThread().getName() + " failed to write element " + element.getId() + '(' + element.getClass().getSimpleName() + ')', e);
-						}
-					}
-					if (workPackage != null) {
-						try {
-							writerInputQueue.put(workPackage);
-						} catch (InterruptedException e) {
-							throw new RuntimeException("Thread " + Thread.currentThread().getName() + " failed to return work package", e);
+							throw new RuntimeException("Thread " + Thread.currentThread().getName() + " failed to write element ",e);
 						}
 					}
-				}
-			}
 		}
 	}
 }
\ No newline at end of file
-- 
1.7.1

_______________________________________________
mkgmap-dev mailing list
mkgmap-dev@lists.mkgmap.org.uk
http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev

Reply via email to