Hi Devdatta,

Thanks for asking, happy to provide the documentation here:

*file* : name of the GTFS-RT feed file that was processed (one arrives
every 30 secs, so each is named by the epoch timestamp of its download)
*num_vehicles* : number of vehicle locations (entities) in the feed
*feed_timestamp* : the timestamp in the feed's header. If it's the same
value as in the previous feed, the feed is a duplicate. (Whoops, I just
found out there were some 13k dupes in my data!)
*feed_time* : human-readable IST form of feed_timestamp
*incrementality* : another field from the feed's header
*bad_count* : number of entities with missing or flawed lat-longs, like 0,0
*earliest* : timestamp of the earliest (farthest back in time) vehicle
entity in the file. (Apart from the feed timestamp, each vehicle location
'entity' carries a timestamp of its own, because we can't assume all
700-odd vehicles sent in their lat-longs at exactly the same moment. We're
doing asynchronous business here.)
*diff1* : gap in seconds between earliest and feed_timestamp
*latest* : timestamp of the latest vehicle entity in the file
*diff2* : gap in seconds between latest and feed_timestamp

What diff1 and diff2 tell me : how "dated" and how "recent" the
information in the feed is. If diff2 were consistently too large, I
wouldn't bother downloading a fresh feed every 30 secs, which is the
standard minimum refresh interval as per the GTFS-RT spec.
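To make the field definitions concrete, here's a rough Python sketch of how
these per-file values could be computed from one decoded feed. This is
illustrative only : the `feed_report` helper is my naming for this email, not
my actual script, and the dict layout follows the sample JSON further below.

```python
from datetime import datetime, timedelta, timezone

IST = timezone(timedelta(hours=5, minutes=30))  # UTC + 5.5 hrs

def feed_report(feed):
    """Summarise one decoded GTFS-RT feed (a dict shaped like the sample JSON)."""
    feed_ts = int(feed['header']['timestamp'])
    entities = feed.get('entity', [])
    bad = 0       # entities with missing or junk lat-longs like 0,0
    stamps = []   # per-entity timestamps (each vehicle reports its own)
    for e in entities:
        veh = e.get('vehicle', {})
        pos = veh.get('position', {})
        if not pos.get('latitude') or not pos.get('longitude'):
            bad += 1
        if veh.get('timestamp'):
            stamps.append(int(veh['timestamp']))
    earliest = min(stamps) if stamps else feed_ts
    latest = max(stamps) if stamps else feed_ts
    return {
        'num_vehicles': len(entities),
        'feed_timestamp': feed_ts,
        'feed_time': datetime.fromtimestamp(feed_ts, IST).strftime('%Y-%m-%d %H:%M:%S'),
        'incrementality': feed['header'].get('incrementality'),
        'bad_count': bad,
        'earliest': earliest,
        'diff1': feed_ts - earliest,   # how dated the oldest entity is
        'latest': latest,
        'diff2': feed_ts - latest,     # how recent the newest entity is
    }
```

Run that over each downloaded file and you get one row of the reports CSV.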

For reference, here's sample JSON output from a feed:

{'header': {'gtfsRealtimeVersion': '2.0',
  'incrementality': 'FULL_DATASET',
  'timestamp': '1550596818'},
 'entity': [{'id': 'vehicle',
   'vehicle': {'trip': {'tripId': '6255', 'routeId': '225'},
    'position': {'latitude': 28.610946655273438,
     'longitude': 76.980224609375,
     'speed': 0.0},
    'timestamp': '1550596773',
    'vehicle': {'id': 'DL1PD0716', 'label': 'DL1PD0716'}}},
  ...]}
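Since a repeated header timestamp flags a duplicate feed (that's how the 13k
dupes show up), a minimal dedupe pass over the reports CSV could look like
this. A sketch only : `dedupe_by_feed_timestamp` is a hypothetical helper,
with rows read as dicts keyed by the CSV header.

```python
import csv  # used in the usage note at the bottom

def dedupe_by_feed_timestamp(rows):
    """Yield only the first row seen for each feed_timestamp value."""
    seen = set()
    for row in rows:
        ts = row['feed_timestamp']
        if ts in seen:
            continue  # same header timestamp as an earlier feed: duplicate
        seen.add(ts)
        yield row

# Usage (column names as per the reports CSV header):
# with open('delhi_vehicle_reports.csv') as f:
#     unique_rows = list(dedupe_by_feed_timestamp(csv.DictReader(f)))
```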

Repeat-link to my code example:
http://kyso.io/answerquest/delhi-gtfs-rt-feed-file-analysis

--
Cheers,
Nikhil VJ, Pune, India
http://nikhilvj.co.in


On Fri, Mar 22, 2019 at 9:53 AM Devdatta Tengshe <devda...@tengshe.in>
wrote:

> Hi Nikhil,
>
> Thanks for sharing this data.
> I had a question about the 'delhi_vehicle_reports.csv' file.
>
> Is there any documentation about the fields in this file?
>
> I see the following headers:
>
>>
>> file,num_vehicles,feed_timestamp,feed_time,incrementality,bad_count,earliest,diff1,latest,diff2
>>
> & I'm wondering what they are
>
> Regards,
> Devdatta
>
>
> On Thu, Mar 21, 2019 at 1:24 PM Nikhil VJ <nikhil...@gmail.com> wrote:
>
>> Hi folks,
>>
>> I have been archiving Delhi's bus realtime GTFS feeds on my server for a
>> month now and collating the data into a flat CSV. Sharing that data at
>> this link for download and analysis:
>> https://server.nikhilvj.co.in/place1/
>>
>> Hoping some folks can make some analysis, visualizations or so of it - I
>> don't have time to delve too much into that right now. It's been a great
>> learning experience arranging the scripts and structures on my digitalocean
>> server to make this long-term continuous archival process possible.
>>
>> Disclaimer: Default reply to every sage advice that starts with "Why
>> don't you.." is : "Sounds good, please do it and get back with the
>> results." I'm satisfied at my end and am sharing the data wealth here for
>> others to take forward, so don't bug me, just take it and go! ;)
>>
>>
>> Some more notes:
>> 1. Get 7zip portable / p7zip-full to uncompress it. The uncompressed file
>> is around 6 gigs; compressed it's about half a gig.
>> 2. There may be many repetitions in the data, though, since the feeds were
>> coming in every 30 secs; plus, as per my last email, the analysis showed
>> repetitions within a single feed itself. So there's a data cleaning
>> challenge for you here: remove the repetitions. (Do it - don't expect
>> things to be already done for you unless you're paying a fortune for it!)
>> 3. If there is too much traffic on my server then I'll lock it all down
>> with username-password restrictions. So don't do silly things like telling
>> a whole class of students to each download it from the server. Use a pen
>> drive or your LAN.
>> 4. There is an accompanying reports csv that gives file-level summaries.
>> 5. Timestamps are in epoch format in the UTC timezone (as per GTFS-RT
>> specs). Look up "epoch converter". In the reports file I've added 5.5 hrs
>> to get human-readable times in IST.
>> 6. The data covers all dates from 19 Feb onwards. Moving forward I might
>> make the scripts store things month-wise or week-wise. Here it was
>> important to start asap.
>> 7. Early every morning, my scripts will place a fresh version of the data
>> there and remove the previous day's one. So don't be downloading stuff
>> from there at 5 am.
>> 8. Tip : Python? Wanna map? Check out folium.
>> 9. Tip : Folium? Wanna share the ipynb notebook? Check out kyso.io
>>
>> PS: Thanks JohnsonC for the kind words. But that is because I use
>> datameet from google groups instead of from my mailbox, unless it's an
>> immediate followup. So it's like stackexchange for me, and it saves me time
>> and effort.
>>
>> Cheers
>> Nikhil VJ
>> Pune, India
>>
>>
>> On Thursday, March 14, 2019 at 11:50:24 AM UTC+5:30, JohnsonC wrote:
>>>
>>>
>>> This is helpful.
>>> Thanks for updating on this Nikhil.
>>> This thread was from November and you bothered to search and update it.
>>>
>>> Thanks,
>>>
>>>
>>> On Wed, 13 Mar 2019 at 20:38, Nikhil VJ <nikh...@gmail.com> wrote:
>>>
>>>> Hi Folks,
>>>>
>>>> Sometime last month my API key for the realtime feed of Delhi bus data
>>>> started working. (Link to register for yours
>>>> <https://otd.delhi.gov.in/data/realtime/>.)
>>>>
>>>> Here's an "unboxing" of one gtfs-realtime vehicleposition feed file
>>>> from there:
>>>> http://kyso.io/answerquest/delhi-gtfs-rt-feed-file-analysis
>>>>
>>>>
>>>> Note: I'm guessing this is not DTC but other bus services operating in
>>>> Delhi.
>>>>
>>>>
>>>> - Nikhil VJ
>>>> Pune, India
>>>> https://nikhilvj.co.in
>>>>
>>>>
>>>> On Thursday, November 29, 2018 at 5:17:39 PM UTC+5:30, Nikhil VJ wrote:
>>>>>
>>>>> Hi Arun,
>>>>>
>>>>> This data doesn't include any shapes.txt file; probably that script
>>>>> requires it. shapes.txt is not mandatory in GTFS. The routes are
>>>>> defined as sequences of stops in stop_times.txt (multiplied by the
>>>>> number of trips in a day, that is).
>>>>>
>>>>> There's room here for improvement. Here's a full gtfs validator output
>>>>> for the delhi data :
>>>>> http://nikhilvj.co.in/files/delhi_gtfs/delhi-gtfs-2.html
>>>>>
>>>>> One peculiarity : The routes have been split up into separate onward
>>>>> and return journey routes.
>>>>>
>>>>> If anybody knows someone on the technical team for this, kindly connect
>>>>> me with them. The project leads are probably too busy handling realtime
>>>>> data access requests and won't take too kindly to feedback about what
>>>>> improvements can be made on the static side, but I might be able to put
>>>>> something across to the technical folks.
>>>>>
>>>>> You can zip up and import the static GTFS files into the static GTFS
>>>>> Manager <https://github.com/WRI-Cities/static-GTFS-manager> tool. If
>>>>> someone wants to draw the shapefiles of the routes and add them in, the
>>>>> "Default Sequence" page will help you do that.
>>>>>
>>>>> --
>>>>> Cheers,
>>>>> Nikhil VJ
>>>>> +91-966-583-1250
>>>>> Pune, India
>>>>> http://nikhilvj.co.in
>>>>>
>>>>>
>>>>> On Tue, Nov 27, 2018 at 1:50 AM Arun Ganesh  wrote:
>>>>>
>>>>>> Was anyone able to convert the GTFS feed into a geojson?
>>>>>>
>>>>>> I tried https://github.com/BlinkTagInc/gtfs-to-geojson but for some
>>>>>> reason it does not produce any route lines.
>>>>>>
>>>>
>>>
>>>
>>> --
>>> Warm Regards,
>>> Johnson Chetty
>>>
>>>
>>>
>>>
>>> --
>> Datameet is a community of Data Science enthusiasts in India. Know more
>> about us by visiting http://datameet.org
>> ---
>> You received this message because you are subscribed to the Google Groups
>> "datameet" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to datameet+unsubscr...@googlegroups.com.
>> For more options, visit https://groups.google.com/d/optout.
>>
>

-- 
Datameet is a community of Data Science enthusiasts in India. Know more about 
us by visiting http://datameet.org
--- 
You received this message because you are subscribed to the Google Groups 
"datameet" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to datameet+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
