* [email protected] <[email protected]> [181009 17:12]:
>
> Hello Darshit Shah,
>
> Thank you for your welcome message. I am glad to be part of your project!
>
> I don't understand the term "javascript engine". AFAIK javascript is code that
> runs on the browser side, and we have no problem fetching it.
>
Exactly! Javascript is code that is executed on the client side and hence
requires a javascript engine which interprets the code and executes it. However,
Wget does not and will not package a javascript engine in order to run those
scripts. This means that sites where Javascript is used to create hyperlinks
won't work well when scraped through Wget.
>
> There might be "ajax" issues with sites that rely on it. Ajax is used heavily
> by programmers and they will have to take some action on their site to
> incorporate the engine.
Similarly, sites that use Javascript to show menus or create AJAX requests are
usually not amenable to being scraped as a static HTML page.
>
> POST requests to comments and mail will need to be taken care of so they will
> work on a static site. One solution is to use a hosted supplier that will
> carry out the task and deliver spam removal as well.
> I think I will be able to write a howto document on that.
>
> Michael
>
> -----Original Message-----
> From: Darshit Shah <[email protected]>
> Sent: Tuesday, 9 October, 2018 2:52 PM
> To: [email protected]
> Cc: [email protected]
> Subject: Re: [Bug-wget] Hello again
>
> Hi Michael,
>
> Nice to hear from you again. I vaguely remember a mention of someone who
> wanted to work on this feature. When deciding to make this work, please
> remember that any of this can only work if the site does not rely on
> Javascript, which, given Wordpress, is a difficult thing. The reason for this
> is that we do _not_ intend to ship a javascript engine along with Wget2. It is
> too large, unwieldy and too much of a maintenance nightmare. However, if the
> site can work without Javascript, then I would assume that Wget2 can already
> handle making a static copy. If it can't handle something, please let us know
> / file a bug report about it.
>
> Of course, I welcome you to work on Wget2 as you see fit. And we would love to
> look at any contributions you can make. We will also try and help you out as
> much as possible when dealing with the codebase.
>
> About the dev setup, I only use vim and gdb to work with Wget. As Tim has
> already mentioned, he uses Netbeans and might be able to help you out.
>
> You also mentioned something about the lib/ directory. That is an
> auto-generated dir with compatibility libs that you don't need to care about.
> All the code for Wget2 is in src/ and the code for the library is in libwget/.
> Those are the two main directories you need to care about. And of course
> tests/ for the tests.
>
> * [email protected] <[email protected]> [181008 21:22]:
> >
> > Hello again,
> >
> > My name is Michael. I approached you about a year ago.
> >
> > I am interested in making wget2 a tool that can convert the output of
> > content management systems (like WordPress) to HTML. This limits the content
> > management system to generating the website every time it is changed, and
> > the presentation is done using the HTTP server only.
> >
> > This is an important feature as it prevents a security risk: a hacker
> > penetrating the site and installing viruses or stealing data.
> > It also allows the website to be delivered much faster as no PHP code needs
> > to run in order to deliver the content. Google has already announced that
> > site download speed is a factor in its SEO evaluation.
> >
> > I will be able to work for 3 hours every week on the project. I do need some
> > guidance from you.
> >
> > I have started to configure the Netbeans IDE, as using a debugger can help
> > me delve into the code much faster. There are some issues with Netbeans. Do
> > you use an IDE? Which one?
> >
> > Best regards,
> >
> > Michael
> >
>
> --
> Thanking You,
> Darshit Shah
> PGP Fingerprint: 7845 120B 07CB D8D6 ECE5 FF2B 2A17 43ED A91A 35B6
>

--
Thanking You,
Darshit Shah
PGP Fingerprint: 7845 120B 07CB D8D6 ECE5 FF2B 2A17 43ED A91A 35B6
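
To illustrate the point made in the thread about hyperlinks created by
Javascript, here is a minimal sketch in Python. It is not how Wget or Wget2
parse HTML; it only assumes a generic static scanner that reads the markup
without executing scripts. The names LinkCollector, extract_links and PAGE,
and the sample page itself, are hypothetical and exist only for this example.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags found in the static markup."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Only tags present in the HTML source are seen here.
        # The body of <script> is passed through as plain data, never run.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# Hypothetical page: one link is in the markup, one is built by a script.
PAGE = """
<html><body>
<a href="/about.html">About</a>
<div id="menu"></div>
<script>
  var a = document.createElement("a");
  a.href = "/products.html";
  a.textContent = "Products";
  document.getElementById("menu").appendChild(a);
</script>
</body></html>
"""


def extract_links(html):
    """Return the links a scanner without a Javascript engine can find."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


if __name__ == "__main__":
    # Only the link written directly in the markup is reported.
    print(extract_links(PAGE))
```

A browser would show both links, because it runs the script and adds the
second one to the page. A tool that only reads the markup never learns about
/products.html, so a static copy of such a site would be missing that page.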
