Many thanks for doing this. This is invaluable! 

In 2013, I tried to get CDs of the Maharashtra Gazette because of some 
of the same issues with their website that you have mentioned. I have quite 
a story to tell. (I intended to use the copy of the gazette to maintain 
references on several articles on Wikipedia that had suffered link rot.)

Thanks,
Rohini
On Wednesday, 11 June 2025 at 14:51:20 UTC+5:30 sreeram kandimalla wrote:

> Thanks Arun, but this was an almost decade-old project from Carl 
> Malamud from Public Resource <https://public.resource.org/> and Sushant 
> Sinha from IndianKanoon <https://indiankanoon.org/>. 
>
> More websites showed up since they started the work, so I helped by 
> filling in the missing states. 
>
> The plan is to consolidate everything in one repo and maintain it. This 
> is going to be painful because the sites keep disappearing, changing 
> their backend software, changing domain names and, in one case, 
> corrupting their DNS entries (all of which happened while I was trying 
> to get the crawlers running).
>
>
>
> On Wed, Jun 11, 2025 at 1:52 PM Arun Ganesh <[email protected]> wrote:
>
>> This is hero's work and an extraordinary resource for the Government 
>> itself!
>>
>> Thank you Sreeram for making this happen.
>>
>> On Wed, Jun 11, 2025 at 12:48 PM sreeram kandimalla <
>> [email protected]> wrote:
>>
>>> Hi all,
>>>
>>> We now have most of the Indian central and state gazettes archived at 
>>> https://archive.org/details/gazetteofindia?sort=-date
>>>
>>> There are crawlers running daily out of the code at the egazette 
>>> <https://github.com/sushant354/egazette> repo and my temporary fork 
>>> <https://github.com/ramSeraph/egazette> of the same.
>>>
>>> One of the advantages of having the data at archive.org is that it 
>>> comes with automatic OCR (using Tesseract), a free-text search engine and 
>>> the possibility of getting an RSS feed based on a search query. I hope 
>>> people build some useful things with it. 
>>>
>>> The following states and union territories currently have problems:
>>>
>>>    1. *Andaman and Nicobar Islands*: Site doesn't have current data. 
>>>    2. *Jammu and Kashmir*: Site is offline. Hopefully temporarily.
>>>    3. *Mizoram*: Data is not being updated at source.
>>>    4. *Meghalaya*: Data is delayed by 3 months.
>>>    5. *West Bengal*: No gazette site could be found. I would appreciate 
>>>    it if anyone can locate it (https://www.wbgazettepart2.in/ is not 
>>>    it). 
>>>
>>> Thanks,
>>> Sreeram K
>>>
>>> -- 
>>> Datameet is a community of Data Science enthusiasts in India. Know more 
>>> about us by visiting http://datameet.org
>>> --- 
>>> You received this message because you are subscribed to the Google 
>>> Groups "datameet" group.
>>> To unsubscribe from this group and stop receiving emails from it, send 
>>> an email to [email protected].
>>> To view this discussion visit 
>>> https://groups.google.com/d/msgid/datameet/CAMgvHC5sttm0hoajbFySGRRVHUmHKM2d3e-_NtmpooSUxAd1OQ%40mail.gmail.com
>>>  
>>> <https://groups.google.com/d/msgid/datameet/CAMgvHC5sttm0hoajbFySGRRVHUmHKM2d3e-_NtmpooSUxAd1OQ%40mail.gmail.com?utm_medium=email&utm_source=footer>
>>> .
>>>
>

