Hello Seniors,

Where can I get city shapefiles for Tarapur, Aurangabad, and Nashik? All of these cities are in Maharashtra state.

Uzair

On Mon, Feb 7, 2022 at 12:42 PM Piyush Kumar <psh.kumar1...@gmail.com> wrote:

> Thank you Sanjay and Nikhil. I think these are good starting points to try
> and figure out how to get this done and I am sure with some time and
> effort, it is possible.
>
> Piyush
>
> On Sun, 6 Feb 2022 at 17:48, Nikhil VJ <nikhil...@gmail.com> wrote:
>
>> Hi,
>>
>> I don't think Selenium is required - this looks like it can be done with
>> just varying the request payload of one POST api call.
>> POST api call to URL:
>> https://missionantyodaya.nic.in/preloginVillageInfrastructureReports2020.html
>> the POST request content type is application/x-www-form-urlencoded
>>
>> At *state level*, request payload is like:
>> stateCode: 27
>> stateName: MAHARASHTRA
>> districtCode:
>> districtName:
>> blockCode:
>> blockName:
>> gpCode:
>> gpName:
>>
>> At *district level* it becomes:
>> stateCode: 27
>> stateName: MAHARASHTRA
>> districtCode: 469
>> districtName: AURANGABAD
>> blockCode:
>> blockName:
>> gpCode:
>> gpName:
>>
>> then *block level*:
>> stateCode: 27
>> stateName: MAHARASHTRA
>> districtCode: 469
>> districtName: AURANGABAD
>> blockCode: 4315
>> blockName: KHULTABAD
>> gpCode:
>> gpName:
>>
>> then *GP level*:
>> stateCode: 27
>> stateName: MAHARASHTRA
>> districtCode: 469
>> districtName: AURANGABAD
>> blockCode: 4315
>> blockName: KHULTABAD
>> gpCode: 170584
>> gpName: BODKHA
>>
>> If in python, one can use BeautifulSoup to capture the table data as
>> well as get the (code + name) pairs for the next level.
>>
>> --
>> Cheers,
>> Nikhil VJ
>> https://nikhilvj.co.in
>>
>>
>> On Fri, Feb 4, 2022 at 1:42 PM Sanjay Bhangar <sanjaybhan...@gmail.com>
>> wrote:
>>
>>> Piyush -
>>>
>>> You could write a python (or your preferred language) script that just
>>> requests the HTML, parses it, and follows the hierarchy, without using
>>> selenium.
>>> This could be a bunch of work as the site doesn't use regular
>>> links with GET requests, but rather when you click on a state in the table,
>>> it uses Javascript to fill up hidden form fields with the state code, etc.
>>> and then does a form submit, causing a POST request to be made with those
>>> values.
>>>
>>> For eg. you can see the links in the table have an onClick handler like
>>> "selectState(2,'HIMACHAL PRADESH','preloginDistrictInfrastructureReports2020.html')".
>>>
>>> Then, in the javascript, you can see the selectState function defined
>>> like so:
>>>
>>> function selectState(stateCode,stateName,action){
>>>     $("#stateCode").val(stateCode);
>>>     $("#stateName").val(stateName);
>>>     $("#reportForm").attr('action', action);
>>>     $("#reportForm").submit();
>>> }
>>>
>>> In this JS file:
>>> https://missionantyodaya.nic.in/resources/antyodaya/js/custom/prelogin/reports/preloginReport.js
>>>
>>> So this will make a POST request to
>>> preloginDistrictInfrastructureReports2020.html
>>> with stateCode=2, stateName=HIMACHAL PRADESH
>>>
>>> Similarly, there are different onClick handlers defined for selecting
>>> districts, etc. that you can follow down to see what URLs they are calling
>>> with what parameters. And in theory, you could write some HTML parsing code
>>> and some regex to go through the items in each table, parse out the
>>> parameters and URLs to call, and follow things down.
>>>
>>> So, in theory you could write this without mucking around with selenium,
>>> but it also seems like a lot more work than if the site was structured
>>> "normally" with unique URLs and GET requests.
>>>
>>> For the page numbering, this seems okay: the HTML outputs all the items
>>> across all the pages, and then the actual pagination on the page is purely
>>> client-side javascript - so if you were to read the HTML on the page via
>>> python or so, you would just get all the items in the table without having
>>> to worry about pagination.
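
The approach described above (pull the parameters out of the onClick handlers with a regex, then replay the form submit as a POST) could be sketched roughly as follows in Python. This is only a sketch under assumptions, not tested against the live site: the function names parse_state_links, build_payload and fetch_report are made up for illustration, the regex only covers the selectState handler quoted above (the district/block/GP handlers are not shown in this thread and would need their own patterns), and the form field names are taken from the payloads listed earlier in the thread.

```python
import re
from urllib.parse import urlencode, urljoin
from urllib.request import Request, urlopen

BASE_URL = "https://missionantyodaya.nic.in/"

# Matches handlers of the form quoted in the thread, e.g.
# selectState(2,'HIMACHAL PRADESH','preloginDistrictInfrastructureReports2020.html')
SELECT_STATE_RE = re.compile(
    r"selectState\(\s*(\d+)\s*,\s*'([^']*)'\s*,\s*'([^']*)'\s*\)"
)


def parse_state_links(html):
    """Return (state_code, state_name, action) tuples found in the HTML."""
    links = []
    for code, name, action in SELECT_STATE_RE.findall(html):
        # Collapse any whitespace or line wrapping inside the name.
        links.append((code, " ".join(name.split()), action))
    return links


def build_payload(state_code, state_name,
                  district_code="", district_name="",
                  block_code="", block_name="",
                  gp_code="", gp_name=""):
    """Build the form fields; levels that are not selected stay empty."""
    return {
        "stateCode": str(state_code),
        "stateName": state_name,
        "districtCode": str(district_code),
        "districtName": district_name,
        "blockCode": str(block_code),
        "blockName": block_name,
        "gpCode": str(gp_code),
        "gpName": gp_name,
    }


def fetch_report(action, payload, timeout=30):
    """POST the form fields to the given action page and return the HTML.

    Assumption: the site accepts a plain form POST without cookies or
    extra headers; this has not been verified.
    """
    data = urlencode(payload).encode("ascii")
    request = Request(
        urljoin(BASE_URL, action),
        data=data,  # supplying data makes urllib send a POST
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )
    with urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

One would call parse_state_links on the HTML of the state-level page, pass each (code, name) pair to build_payload, and hand the result to fetch_report together with the action from the handler. The same pattern would then repeat one level down once the district-level handler has been inspected. Since the pagination is client-side, the returned HTML should already contain every row of the table.
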
>>>
>>> Unfortunately, this does seem like a lot of work and I don't really have
>>> the time to do anything, but it seemed like an interesting problem and I
>>> was curious so I took a look. Hope it could help a bit.
>>>
>>> All the best,
>>> Sanjay
>>>
>>> On Fri, Feb 4, 2022 at 1:03 PM Piyush Kumar <psh.kumar1...@gmail.com>
>>> wrote:
>>>
>>>> Could folks here suggest how to go about this?
>>>>
>>>> https://missionantyodaya.nic.in/preloginStateInfrastructureReports2020.html
>>>>
>>>> When we click this link, we get data on village-level infrastructure
>>>> put within multiple HTML tables across many pages (separated into state,
>>>> dist., block etc.)
>>>>
>>>> Suppose I want to scrape data up to the village level for a particular
>>>> state, is there any way I can get it done without too much back and forth
>>>> over Selenium webdriver? Please note that to access village level data you
>>>> have to go through a nested hierarchy of links (gram panchayat within block,
>>>> which is within a district and so on). To make matters more complicated,
>>>> the pages have also not been numbered.
>>>>
>>>> Can someone in the know help me figure this out?
>>>>
>>>> Thanks in advance
>>>> Piyush
>>>>
>>>> --
>>>> Datameet is a community of Data Science enthusiasts in India. Know more
>>>> about us by visiting http://datameet.org
>>>> ---
>>>> You received this message because you are subscribed to the Google
>>>> Groups "datameet" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>> an email to datameet+unsubscr...@googlegroups.com.
>>>> To view this discussion on the web visit
>>>> https://groups.google.com/d/msgid/datameet/CAFtOtdujRhq36O4SW%3Dtie%2BSDH_6Pq1R87B6nVerzU4giQVka%3Dw%40mail.gmail.com
>>>> <https://groups.google.com/d/msgid/datameet/CAFtOtdujRhq36O4SW%3Dtie%2BSDH_6Pq1R87B6nVerzU4giQVka%3Dw%40mail.gmail.com?utm_medium=email&utm_source=footer>
>>>> .

--
Datameet is a community of Data Science enthusiasts in India. Know more about us by visiting http://datameet.org
---
You received this message because you are subscribed to the Google Groups "datameet" group.
To unsubscribe from this group and stop receiving emails from it, send an email to datameet+unsubscr...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/datameet/CADmA_V7szo2gAiSMeRWFbrkjOrRuVZNN%3Deru2kUOJFKginQ8oA%40mail.gmail.com.