Re: [Talk-transit] GTFS compatibility

2010-06-30 Thread Joe Hughes
Many transport agencies/operators still don't have rider-facing stop
codes, and some may have them for only a subset of their stops.
However, as you say, where they *are* present, they offer the most
stable dataset-local identifier for a stop, if only because of the
cost of changing real-world signage.

The other contenders in GTFS are the stop_id (which is unstable,
originating as it does from the whims of the agency's database
maintainers) and the stop_name (which is generally more stable but
still subject to occasional tweaking, and which is not guaranteed to
be unique within a dataset).
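
As a rough illustration of the preference order above (a sketch only, not
anything the GTFS spec prescribes), a matching key could be picked per stop
like this. The field names are the real stops.txt columns; the function
name stable_key and the sample rows are invented for the example.

```python
import csv
import io
from collections import Counter

# Invented sample rows; only the column names come from the GTFS spec.
SAMPLE_STOPS_TXT = """stop_id,stop_code,stop_name,stop_lat,stop_lon
AA10,3017,Main / 1st,45.4150,-75.6950
AA20,,Main / 2nd,45.4110,-75.6930
AA30,3019,Main / 1st,45.4151,-75.6952
"""


def stable_key(stop):
    """Return (kind, value) for the most stable identifier a stop offers.

    Prefers the rider-facing stop_code and falls back to stop_id when the
    agency has not published a code for this stop.
    """
    code = (stop.get("stop_code") or "").strip()
    if code:
        return ("stop_code", code)
    return ("stop_id", stop["stop_id"])


stops = list(csv.DictReader(io.StringIO(SAMPLE_STOPS_TXT)))
keys = [stable_key(s) for s in stops]

# stop_name is not guaranteed unique, so list the names that repeat.
name_counts = Counter(s["stop_name"] for s in stops)
ambiguous_names = sorted(n for n, c in name_counts.items() if c > 1)
```

Falling back to stop_id keeps every stop addressable, at the price of a key
that may change in the agency's next export.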

It seems clear that there will always need to be some sort of fuzzy
matching of stops employed in subsequent updates/imports of data in
countries that have no national registry.  It also seems plausible to
me that someone could build an OSM-based shadow global stop registry,
and thus leapfrog national governments that haven't built such a
database of their own.
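
To make "fuzzy matching" a little more concrete, here is a minimal sketch
that scores a candidate pair on distance and name similarity. The 50 m
cutoff, the equal weighting, and the 0.6 threshold are arbitrary
assumptions, and the function names are made up for illustration; a real
importer would need to tune all of them.

```python
import math
from difflib import SequenceMatcher


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two WGS84 points."""
    r = 6371000.0  # mean Earth radius in meters
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = p2 - p1
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def name_similarity(a, b):
    """Case-insensitive similarity ratio in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def match_score(old, new, max_dist_m=50.0):
    """Combine closeness and name similarity; 0.0 if too far apart."""
    dist = haversine_m(old["lat"], old["lon"], new["lat"], new["lon"])
    if dist > max_dist_m:
        return 0.0
    closeness = 1.0 - dist / max_dist_m
    return 0.5 * closeness + 0.5 * name_similarity(old["name"], new["name"])


def best_match(new, candidates, threshold=0.6):
    """Return the best-scoring candidate, or None if nothing is convincing."""
    scored = [(match_score(c, new), c) for c in candidates]
    score, cand = max(scored, key=lambda t: t[0], default=(0.0, None))
    return cand if score >= threshold else None


# Invented stops for illustration.
stop_a = {"name": "Main St / 1st Ave", "lat": 45.0, "lon": -75.0}
stop_b = {"name": "Elm St", "lat": 45.01, "lon": -75.0}
incoming = {"name": "Main St & 1st Ave", "lat": 45.0001, "lon": -75.0}
matched = best_match(incoming, [stop_a, stop_b])
```

Anything that falls below the threshold would be left for a human to
reconcile rather than matched automatically.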

Cheers,
Joe


Re: [Talk-transit] GTFS compatibility

2010-06-30 Thread john whelan
I'm currently looking at bus stops in Ottawa in OSM and finding similar
issues with the existing bus_stops.  I'm seriously wondering whether, where
stop_codes exist, one approach might be to import bus_stops using GTFS
data and use the GTFS tags such as stop_code from stops.txt:
http://code.google.com/transit/spec/transit_feed_specification.html#stops_txt___Field_Definitions

Tools such as JOSM have a search facility, so we should be able to search for
bus_stops without a stop_code and then reconcile them with the ones that have
a stop_code tag.  If the GTFS data is wrong, we should be able to send a
report somewhere, probably to the transit authority, saying we think this
stop is incorrect.
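
Outside JOSM, the same split could be sketched against an OSM XML extract.
This is only an illustration: highway=bus_stop is the usual OSM tag, but a
stop_code key is just the convention proposed in this message, and the
sample data and the function name split_bus_stops are invented.

```python
import xml.etree.ElementTree as ET

# Invented sample extract in OSM XML form.
SAMPLE_OSM = """<osm version="0.6">
  <node id="1" lat="45.4150" lon="-75.6950">
    <tag k="highway" v="bus_stop"/>
    <tag k="stop_code" v="3017"/>
  </node>
  <node id="2" lat="45.4110" lon="-75.6930">
    <tag k="highway" v="bus_stop"/>
  </node>
  <node id="3" lat="45.4000" lon="-75.6900">
    <tag k="amenity" v="cafe"/>
  </node>
</osm>"""


def split_bus_stops(osm_xml):
    """Return (ids with a stop_code tag, ids without) for bus_stop nodes."""
    with_code, without_code = [], []
    for node in ET.fromstring(osm_xml).iter("node"):
        tags = {t.get("k"): t.get("v") for t in node.findall("tag")}
        if tags.get("highway") != "bus_stop":
            continue  # not a bus stop, ignore
        target = with_code if "stop_code" in tags else without_code
        target.append(node.get("id"))
    return with_code, without_code


with_code, without_code = split_bus_stops(SAMPLE_OSM)
```

The second list is the set that would need reconciling by hand.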

My personal view is that, while we should respect work already done, adding
extra tags in this way doesn't remove that work, and it is up to the
rendering rules to either omit or include a particular bus_stop for display.
This can be selected by the presence or absence of a stop_code tag, certainly
in Maperitive.

If we were to do this, we would probably need some sort of wiki write-up and
a standard way to label bus_stops.  Currently in Ottawa I've seen at least
four different ways, the ones with the bus route on them being the least
useful as they tend to be out of date.

I don't think mapping routes works at all well.  Certainly in Ottawa the bus
stops and stop_codes stay in the same physical place, but the bus routes can
be modified three or four times a year.  Some changes are greater than
others.  Also, the transit route planning system that can be accessed from
the web or by phone includes school buses, which are not listed at the stop
but do sometimes provide a useful and quicker way to get from point A to
point B.

Cheerio John


Re: [Talk-transit] GTFS compatibility

2010-06-30 Thread Joe Hughes
Ed,

Great to see someone from the CUTR efforts chiming in here.

Just to clarify one point, when you say "locational errors in GTFS",
you're referring to issues with the source data from the particular
agencies that you're working with, rather than anything having to do
with the representation format itself.  This has also been an
occasional issue with stop data being imported into OSM from NaPTAN.

One of the most important things that we can accomplish with these
efforts is to help find ways to establish two-way flows between these
official sources of data and the distributed army of volunteers and
developers who have a vested interest in improving the accuracy of
their local data.  The progress with this here in the UK has been slow
but encouraging, and there has lately been a lot of good work by the
transit agencies in Boston and New York to be more responsive to
feedback from consumers of the data.

Cheers,
Joe


[Talk-transit] Re: GTFS compatibility

2010-06-30 Thread Hillsman, Edward
Our center has a project to explore the use of OSM as a repository and tool for 
supporting multimodal trip planners (for example, bike to transit, ride the 
bus, walk or bike to final destination). We are keenly interested in the 
current discussion of transit and GTFS in OSM, because one of our tasks is to 
develop software to import from GTFS into OSM, and then update the import as a 
transit agency modifies its routes or stops, taking into account that OSM 
mappers may have found and corrected errors in what was uploaded (or may have 
introduced errors). I'm writing to share some of our experience and get your 
suggestions. We will make the software we develop in this project (for 
uploading, matching, and updating GTFS data in OSM) publicly available.

We think it should be relatively easy to upload a set of GTFS stops into an 
area where no one has mapped bus stops into OSM. Generating the route relations 
will be harder and we may not accomplish that as part of this project. And we 
think that updating such data will be relatively simple, because it can rely on 
tags identifying and cross-referencing the stops; software would look for 
changes, and manual work would be needed to reconcile them. The hard part is 
going to be designing the initial upload process to work in areas where OSM 
already includes some bus stops, but not all of them. In the state of Florida, 
where we are working, there are about 450 stops already in OSM, many in areas 
served by transit agencies with GTFS data. Obviously, we want to respect what 
has been mapped. Things that complicate the initial upload include:

(1) Locational errors in the GTFS data. These are not systematic, and some are 
surprisingly large. One stop is placed more than 200 meters from its actual 
location, and only about 10 meters from a different stop that GTFS places 
within 10 meters of that stop's actual location (and that is mapped accurately 
in OSM). We came into this project knowing that there is locational error in 
GTFS. Now we are trying to figure out how to deal with it. The GTFS locations 
do match those appearing in Google Transit, by the way.
(2) Locational errors in the OSM data. These aren't systematic either but tend 
to be much smaller, except that in a few cases the stop has been recorded on 
the wrong side of the street, and a mapper in one city has recorded stops as 
nodes defining the street way rather than as points to the sides of the street.
(3) Incomplete and inconsistent tagging of the OSM stops. 
(4) The presence in an area of stops for multiple agencies, only one of which 
has GTFS data. Our campus has a shuttle bus circulator system with no GTFS data 
(they operate without a set schedule but with a target 10-minute headway, and 
frequency changes during the day and with the university class schedule). The 
area's main public transportation agency has several routes that pass through 
the campus, and has GTFS data. Most of the public-agency stops on campus, but 
not all, are also campus shuttle stops, and there are many more shuttle stops 
on campus than there are public-agency stops.
(5) Incomplete mapping of stops for each agency in OSM.
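
One way complication (1) might be detected is sketched below: when two GTFS
stops both fall within matching range of the same existing OSM stop, at
least one of them is probably misplaced and needs a manual look. This is an
assumption-laden illustration, not our project's software; the 30 m radius,
the function names, and the sample coordinates are all invented.

```python
import math
from collections import defaultdict


def distance_m(a, b):
    """Equirectangular approximation in meters; fine over short distances."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    x = (lon2 - lon1) * math.cos((lat1 + lat2) / 2)
    y = lat2 - lat1
    return 6371000.0 * math.hypot(x, y)


def contested_osm_stops(gtfs_stops, osm_stops, radius_m=30.0):
    """Map each OSM stop id to the GTFS stop ids claiming it, if more than one."""
    claims = defaultdict(list)
    for g in gtfs_stops:
        for o in osm_stops:
            if distance_m(g["pos"], o["pos"]) <= radius_m:
                claims[o["id"]].append(g["id"])
    return {osm_id: ids for osm_id, ids in claims.items() if len(ids) > 1}


# Invented data: G2 really belongs at OSM stop Y, about 200 m north, but its
# GTFS coordinates put it roughly 10 m from OSM stop X.
osm_stops = [
    {"id": "X", "pos": (28.06000, -82.41000)},
    {"id": "Y", "pos": (28.06180, -82.41000)},
]
gtfs_stops = [
    {"id": "G1", "pos": (28.06005, -82.41000)},
    {"id": "G2", "pos": (28.06008, -82.41003)},
]
contested = contested_osm_stops(gtfs_stops, osm_stops)
```

A distance check alone would happily pair G2 with X, which is why contested
stops are reported rather than matched.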

At the moment, we are rethinking the whole idea of trying to match the GTFS 
stops to the OSM stops for the initial upload. One idea would be to screen all 
stops in a GTFS area to look for tags indicating the operator (or no operator), 
tag all of them with a FIXME describing that an upload has occurred and may 
produce duplicates, but otherwise leave them alone, and then upload the GTFS 
ones. I see problems with that, and in any case it should be done only if there 
is a commitment by the uploader to work quickly to reconcile the two data sets 
in OSM. Given the surprisingly large locational errors in GTFS, I'm also 
uncomfortable with simply uploading it, because putting bad data into the 
system will create confusion. I suspect this is a problem with all uploads. 
We've certainly seen it with the TIGER street data.
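
The screening-and-flagging idea above could look roughly like this. It is a
sketch under stated assumptions: stops are plain dicts, the FIXME wording
and operator names are invented, and nothing here talks to the OSM API.

```python
# Hypothetical note text; the actual wording would need community agreement.
NOTE = "GTFS stops uploaded for this area; possible duplicate, please reconcile"


def group_by_operator(osm_stops):
    """Group stop ids by their operator tag (None when the tag is missing)."""
    groups = {}
    for stop in osm_stops:
        groups.setdefault(stop["tags"].get("operator"), []).append(stop["id"])
    return groups


def flag_existing_stops(osm_stops, note):
    """Return copies of the stops with a FIXME note added, otherwise unchanged."""
    flagged = []
    for stop in osm_stops:
        tags = dict(stop["tags"])  # copy so the input is left alone
        existing = tags.get("FIXME")
        tags["FIXME"] = f"{existing}; {note}" if existing else note
        flagged.append({**stop, "tags": tags})
    return flagged


# Invented sample of stops already mapped in the upload area.
existing_stops = [
    {"id": 1, "tags": {"highway": "bus_stop", "operator": "Agency A"}},
    {"id": 2, "tags": {"highway": "bus_stop"}},
    {"id": 3, "tags": {"highway": "bus_stop", "FIXME": "check position"}},
]
by_operator = group_by_operator(existing_stops)
flagged = flag_existing_stops(existing_stops, NOTE)
```

An existing FIXME is appended to rather than overwritten, so earlier
mappers' notes survive the flagging pass.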

But we are still in the thinking-about-this stage, haven't made any decisions, 
and are looking for suggestions and comments (hence this posting). Until we get 
a much better handle on the initial upload problems, any actual uploading we do 
as part of the project will be limited to the area of our campus, where we know 
what is actually on the ground and can clean up anything we do. We'd definitely 
enjoy sharing work and ideas.

Ed Hillsman

Edward L. Hillsman, Ph.D.
Senior Research Associate
Center for Urban Transportation Research
University of South Florida
4202 Fowler Ave., CUT100
Tampa, FL  33620-5375
813-974-2977 (tel)
813-974-5168 (fax)
hills...@cutr.usf.edu
http://www.cutr.usf.edu



On Tue, 29 Jun 2010 15:26:07 +0100 Joe Hughes wrote:
>I agree that it would be helpful to end up with something that allows
>straightforward conversions to and from the GTFS format.  GTFS is a
>CC-licensed specification [1] which is evolved by an open community
>process [2].  Also, the great majority of U.S