Re: How to build xml sitemap with cron job and make available publicly
On Mon, Jan 19, 2015 at 1:26 PM, Thiago H de Paula Figueiredo <thiag...@gmail.com> wrote:

> On Mon, 19 Jan 2015 19:05:49 -0200, George Christman <gchrist...@cardaddy.com> wrote:
>
>> Well that is what I'm currently doing, but the problem I'm facing is
>> the app generates millions of pages and you're only allowed to have 50k
>> per sitemap.
>
> Not really a problem. See http://www.sitemaps.org/protocol.html, section
> Using Sitemap index files (to group multiple sitemap files).
>
>> Is there a way to get around the permission issue and write the file
>> to webapp or will I be required to have to figure out an alternate
>> approach as you suggested?
>
> Just write to the right folder. You were trying to write to the
> filesystem root, which wouldn't even work to get the sitemap web-accessible
> even if there was no permission problem. Make your code write the file to
> your expanded WAR root folder in Tomcat or other servlet container. This
> link, http://www.avajava.com/tutorials/lessons/how-do-i-get-the-location-of-my-web-application-context-in-the-file-system.html,
> should help you find the right folder.

In general, that's bad advice. Containers are not required to expand the WAR file at all, and your files would be destroyed every time you redeploy the application. I think the generally accepted best practice is to pass a writable directory root as an argument to your web application and write the files there, for example /var//data. With T5, I usually supply the path using a symbol: in development environments it points to the build root, and in production mode it is explicitly set to an absolute path.

Kalle
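The approach Kalle describes (a writable directory passed in from outside the WAR) can be sketched without any Tapestry dependency. This is only an illustration: the property name `sitemap.dir`, the class name and the `build/sitemaps` fallback are assumptions made for this sketch, and in a Tapestry 5 application the value would more likely be supplied as a symbol, as Kalle mentions.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Resolves the directory that generated sitemap files are written to. */
final class SitemapDirectory {

    // Hypothetical property name, chosen for this sketch only.
    static final String PROPERTY = "sitemap.dir";

    /**
     * Returns the configured directory, creating it if needed.
     * Falls back to the given default (e.g. a build folder in development)
     * when nothing is configured.
     */
    static Path resolve(String configured, Path developmentDefault) {
        Path dir = (configured == null || configured.isEmpty())
                ? developmentDefault
                : Paths.get(configured);
        try {
            Files.createDirectories(dir);
        } catch (IOException e) {
            throw new UncheckedIOException("Cannot create " + dir, e);
        }
        // Fail early with a clear message instead of a late "Permission denied".
        if (!Files.isWritable(dir)) {
            throw new IllegalStateException("Sitemap directory is not writable: " + dir);
        }
        return dir.toAbsolutePath();
    }

    public static void main(String[] args) {
        // Production: start the JVM with -Dsitemap.dir=<absolute path>.
        Path dir = resolve(System.getProperty(PROPERTY), Paths.get("build", "sitemaps"));
        System.out.println("Writing sitemaps to " + dir);
    }
}
```

Because the directory lives outside the deployed application, the files survive redeployments; they still have to be served from there, for example by a page or servlet that streams them.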
Re: How to build xml sitemap with cron job and make available publicly
On Mon, 19 Jan 2015 19:05:49 -0200, George Christman wrote:

> Well that is what I'm currently doing, but the problem I'm facing is
> the app generates millions of pages and you're only allowed to have 50k
> per sitemap.

Not really a problem. See http://www.sitemaps.org/protocol.html, section Using Sitemap index files (to group multiple sitemap files).

> Is there a way to get around the permission issue and write the file
> to webapp or will I be required to have to figure out an alternate
> approach as you suggested?

Just write to the right folder. You were trying to write to the filesystem root, which wouldn't make the sitemap web-accessible even if there were no permission problem. Make your code write the file to your expanded WAR root folder in Tomcat or other servlet container. This link, http://www.avajava.com/tutorials/lessons/how-do-i-get-the-location-of-my-web-application-context-in-the-file-system.html, should help you find the right folder.

--
Thiago H. de Paula Figueiredo
Tapestry, Java and Hibernate consultant and developer
http://machina.com.br

To unsubscribe, e-mail: users-unsubscr...@tapestry.apache.org
For additional commands, e-mail: users-h...@tapestry.apache.org
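To make the sitemap index idea concrete, here is a rough sketch of splitting a large URL list into files of at most 50,000 entries and rendering an index that points at them. The class name, the `sitemapN.xml` naming scheme and the example.com URLs are assumptions for illustration; only the 50,000 limit and the XML element names come from the sitemaps.org protocol.

```java
import java.util.ArrayList;
import java.util.List;

/** Splits a URL list into sitemap files of at most 50,000 entries plus one index. */
final class SitemapIndexSketch {

    static final int MAX_URLS_PER_SITEMAP = 50_000; // limit from sitemaps.org
    static final String NS = "http://www.sitemaps.org/schemas/sitemap/0.9";

    /** Number of sitemap files needed for the given URL count. */
    static int fileCount(int urlCount) {
        return (urlCount + MAX_URLS_PER_SITEMAP - 1) / MAX_URLS_PER_SITEMAP;
    }

    /** Returns the i-th slice (0-based) of at most MAX_URLS_PER_SITEMAP URLs. */
    static List<String> slice(List<String> urls, int i) {
        int from = i * MAX_URLS_PER_SITEMAP;
        int to = Math.min(from + MAX_URLS_PER_SITEMAP, urls.size());
        return urls.subList(from, to);
    }

    /** Escapes the characters that must not appear raw in XML text. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")
                .replace("\"", "&quot;").replace("'", "&apos;");
    }

    /** Renders one <urlset> document for a slice of URLs. */
    static String urlset(List<String> urls) {
        StringBuilder xml = new StringBuilder("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
        xml.append("<urlset xmlns=\"").append(NS).append("\">\n");
        for (String url : urls) {
            xml.append("  <url><loc>").append(escape(url)).append("</loc></url>\n");
        }
        return xml.append("</urlset>\n").toString();
    }

    /** Renders the index pointing at sitemap1.xml .. sitemapN.xml under baseUrl. */
    static String index(String baseUrl, int files) {
        StringBuilder xml = new StringBuilder("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
        xml.append("<sitemapindex xmlns=\"").append(NS).append("\">\n");
        for (int i = 1; i <= files; i++) {
            String loc = baseUrl + "/sitemap" + i + ".xml"; // hypothetical naming scheme
            xml.append("  <sitemap><loc>").append(escape(loc)).append("</loc></sitemap>\n");
        }
        return xml.append("</sitemapindex>\n").toString();
    }

    public static void main(String[] args) {
        List<String> urls = new ArrayList<>();
        for (int i = 0; i < 120_000; i++) {
            urls.add("https://www.example.com/vehicles/" + i);
        }
        int files = fileCount(urls.size());
        for (int i = 0; i < files; i++) {
            System.out.println("sitemap" + (i + 1) + ".xml: " + slice(urls, i).size() + " URLs");
        }
        System.out.print(index("https://www.example.com", files));
    }
}
```

With millions of pages, a real job would stream each slice to disk rather than hold the XML in a string; the sketch keeps everything in memory only to stay short.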
Re: How to build xml sitemap with cron job and make available publicly
This is what I'm currently doing, but it isn't working very well due to the size of these things.

    public StreamResponse onActivate(String categoryType) {
        UrlSet urlset = new UrlSet();
        // Loads every ZipDetail row at once.
        List<ZipDetail> results = crudDAO.getAll(ZipDetail.class);

        for (ZipDetail zipDetail : results) {
            // Build an absolute, secure link to the vehicles page for this city.
            Link link = linkSource.createPageRenderLinkWithContext(VehiclesIndex.class, categoryType, Util.getCityContext(urlEncoder, zipDetail));
            link.setSecurity(LinkSecurity.SECURE);

            // One sitemap entry per city.
            SitemapXML siteMapXML = new SitemapXML();
            siteMapXML.setChangefreq(ChangeFreq.ALWAYS.toString());
            siteMapXML.setLoc(link.toAbsoluteURI());
            siteMapXML.setPriority(0.5);
            urlset.getSitemaps().add(siteMapXML);
        }

        return super.onActivate(urlset);
    }

On Mon, Jan 19, 2015 at 4:05 PM, George Christman wrote:
> Well that is what I'm currently doing, but the problem I'm facing is
> the app generates millions of pages and you're only allowed to have 50k
> per sitemap. So that is the first major issue I'm facing which this
> library addresses. I was hoping to put this on a nightly task so a
> search engine wouldn't be triggering the process of generating this
> multiple times.
>
> Is there a way to get around the permission issue and write the file
> to webapp or will I be required to have to figure out an alternate
> approach as you suggested?
>
> On Mon, Jan 19, 2015 at 3:58 PM, Thiago H de Paula Figueiredo wrote:
>> On Mon, 19 Jan 2015 18:41:31 -0200, George Christman wrote:
>>
>>> Caused by: java.io.FileNotFoundException: /sitemap.xml (Permission denied)
>>
>> Here's the cause: you're trying to write a file to the root folder of your
>> filesystem.
>>
>> Sitemaps are simple, so why don't you write a Tapestry page to generate it
>> plus a little bit of URL rewriting to make it available at /sitemap.xml,
>> taking into account what you know about your webapp and caching the data so
>> you don't need to perform the whole process every time the /sitemap.xml URL
>> is requested? The library you're using for that is probably crawling the
>> whole website, hence causing the high load you noticed.

--
George Christman
CEO
www.CarDaddy.com
P.O. Box 735
Johnstown, New York
Re: How to build xml sitemap with cron job and make available publicly
Well, that is what I'm currently doing, but the problem I'm facing is that the app generates millions of pages and you're only allowed to have 50k per sitemap. So that is the first major issue I'm facing, which this library addresses. I was hoping to put this on a nightly task so a search engine wouldn't be triggering the generation process multiple times.

Is there a way to get around the permission issue and write the file to webapp, or will I have to figure out an alternate approach as you suggested?

On Mon, Jan 19, 2015 at 3:58 PM, Thiago H de Paula Figueiredo wrote:
> On Mon, 19 Jan 2015 18:41:31 -0200, George Christman wrote:
>
>> Caused by: java.io.FileNotFoundException: /sitemap.xml (Permission denied)
>
> Here's the cause: you're trying to write a file to the root folder of your
> filesystem.
>
> Sitemaps are simple, so why don't you write a Tapestry page to generate it
> plus a little bit of URL rewriting to make it available at /sitemap.xml,
> taking into account what you know about your webapp and caching the data so
> you don't need to perform the whole process every time the /sitemap.xml URL
> is requested? The library you're using for that is probably crawling the
> whole website, hence causing the high load you noticed.

--
George Christman
CEO
www.CarDaddy.com
P.O. Box 735
Johnstown, New York
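For the nightly task George mentions, one possible shape using only the JDK is sketched below. The class and method names are made up for this example, and the run time is a placeholder; it is not what the thread participants used.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Runs a sitemap-generation task once a day at a fixed local time. */
final class NightlySitemapJob {

    /** Time from now until the next occurrence of runAt, strictly in the future. */
    static Duration untilNextRun(LocalDateTime now, LocalTime runAt) {
        LocalDateTime next = now.toLocalDate().atTime(runAt);
        if (!next.isAfter(now)) {
            next = next.plusDays(1); // today's slot has passed, use tomorrow's
        }
        return Duration.between(now, next);
    }

    /** Schedules the task every 24 hours, starting at the next runAt. */
    static ScheduledExecutorService schedule(Runnable task, LocalTime runAt) {
        ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
        long initialDelay = untilNextRun(LocalDateTime.now(), runAt).toMillis();
        executor.scheduleAtFixedRate(
                task, initialDelay, TimeUnit.DAYS.toMillis(1), TimeUnit.MILLISECONDS);
        return executor; // caller must call shutdown() when the webapp stops
    }
}
```

A call such as `NightlySitemapJob.schedule(generator::run, LocalTime.of(2, 0))` would start the job, where `generator` stands for whatever object writes the files. The fixed 24-hour period drifts by an hour across daylight-saving changes, and the executor has to be shut down on undeploy. Tapestry's own PeriodicExecutor service, if available in your version, may be a better fit inside the application.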
Re: How to build xml sitemap with cron job and make available publicly
On Mon, 19 Jan 2015 18:41:31 -0200, George Christman wrote:

> Caused by: java.io.FileNotFoundException: /sitemap.xml (Permission denied)

Here's the cause: you're trying to write a file to the root folder of your filesystem.

Sitemaps are simple, so why don't you write a Tapestry page to generate it, plus a little bit of URL rewriting to make it available at /sitemap.xml? That way you can take into account what you know about your webapp and cache the data, so you don't need to perform the whole process every time the /sitemap.xml URL is requested. The library you're using for that is probably crawling the whole website, hence causing the high load you noticed.

--
Thiago H. de Paula Figueiredo
Tapestry, Java and Hibernate consultant and developer
http://machina.com.br
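The caching part of this suggestion could look roughly like the following. It is a generic time-based cache, not Tapestry API: the class name, the injected clock and the time-to-live value are all assumptions for this sketch, and the page or dispatcher that would call `get()` is left out.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Caches generated sitemap XML and regenerates it only after a time-to-live expires. */
final class CachedSitemap {

    private final Supplier<String> generator; // the expensive generation step
    private final long ttlMillis;
    private final LongSupplier clock;         // injectable so expiry can be tested

    private String cached;
    private long generatedAt;

    CachedSitemap(Supplier<String> generator, long ttlMillis, LongSupplier clock) {
        this.generator = generator;
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    /** Returns the cached XML, regenerating it if it is missing or older than the TTL. */
    synchronized String get() {
        long now = clock.getAsLong();
        if (cached == null || now - generatedAt >= ttlMillis) {
            cached = generator.get();
            generatedAt = now;
        }
        return cached;
    }

    public static void main(String[] args) {
        CachedSitemap sitemap = new CachedSitemap(
                () -> "<urlset/>",          // stand-in for the real generator
                24L * 60 * 60 * 1000,       // regenerate at most once a day
                System::currentTimeMillis);
        System.out.println(sitemap.get());  // generated on first request
        System.out.println(sitemap.get());  // served from the cache
    }
}
```

The `synchronized` keyword keeps concurrent crawler requests from triggering several generations at once, at the cost of making them wait for the first one.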
Re: How to build xml sitemap with cron job and make available publicly
Hi Thiago, I actually just stumbled upon the sitemapgen4j library, and yes, I'm using Maven.

https://code.google.com/p/sitemapgen4j/

Javadoc:
http://sitemapgen4j.googlecode.com/svn-history/r7/site/javadoc/com/redfin/sitemapgenerator/WebSitemapGenerator.html#WebSitemapGenerator%28java.lang.String,%20java.io.File%29

It shows the following example:

    WebSitemapGenerator wsg = new WebSitemapGenerator("http://www.example.com", myDir);

If I try writing to that directory like so,

    File directory = new File("/");
    Link link = linkSource.createPageRenderLink(Index.class);
    link.setSecurity(LinkSecurity.SECURE);
    WebSitemapGenerator wsg = new WebSitemapGenerator(link.toAbsoluteURI(), directory);

I get the following exception. Any idea what I might be doing wrong?

    ioc.Registry Problem writing sitemap file /sitemap.xml
    ioc.Registry Operations trace:
    ioc.Registry [ 1] Handling traditional 'action' component event request for account/admin/Index:sitemap.
    ioc.Registry [ 2] Triggering event 'action' on account/admin/Index:sitemap
    TapestryModule.RequestExceptionHandler Processing of request failed with uncaught exception: org.apache.tapestry5.ioc.internal.OperationException: Problem writing sitemap file /sitemap.xml [at classpath:com/cardaddy/auto/pages/account/admin/AdminIndex.tml, line 32]
    org.apache.tapestry5.ioc.internal.OperationException: Problem writing sitemap file /sitemap.xml [at classpath:com/cardaddy/auto/pages/account/admin/AdminIndex.tml, line 32]
        at org.apache.tapestry5.ioc.internal.OperationTrackerImpl.logAndRethrow(OperationTrackerImpl.java:184)
        at org.apache.tapestry5.ioc.internal.OperationTrackerImpl.invoke(OperationTrackerImpl.java:90)
        at org.apache.tapestry5.ioc.internal.PerThreadOperationTracker.invoke(PerThreadOperationTracker.java:72)
        at org.apache.tapestry5.ioc.internal.RegistryImpl.invoke(RegistryImpl.java:1258)
        at org.apache.tapestry5.internal.structure.ComponentPageElementResourcesImpl.invoke(ComponentPageElementResourcesImpl.java:154)
        at org.apache.tapestry5.internal.structure.ComponentPageElementImpl.triggerContextEvent(ComponentPageElementImpl.java:1045)
        at org.apache.tapestry5.internal.services.ComponentEventRequestHandlerImpl.handle(ComponentEventRequestHandlerImpl.java:73)
        at org.apache.tapestry5.internal.services.AjaxFilter.handle(AjaxFilter.java:42)
        at $ComponentEventRequestHandler_4f615eb941cad.handle(Unknown Source)
        at org.apache.tapestry5.upload.internal.services.UploadExceptionFilter.handle(UploadExceptionFilter.java:76)
        at $ComponentEventRequestHandler_4f615eb941cad.handle(Unknown Source)
        at org.apache.tapestry5.modules.TapestryModule$37.handle(TapestryModule.java:2220)
        at $ComponentEventRequestHandler_4f615eb941cad.handle(Unknown Source)
        at $ComponentEventRequestHandler_4f615eb941a5c.handle(Unknown Source)
        at org.apache.tapestry5.internal.services.ComponentRequestHandlerTerminator.handleComponentEvent(ComponentRequestHandlerTerminator.java:43)
        at org.apache.tapestry5.internal.services.DeferredResponseRenderer.handleComponentEvent(DeferredResponseRenderer.java:45)
        at $ComponentRequestHandler_4f615eb941a5e.handleComponentEvent(Unknown Source)
        at org.apache.tapestry5.services.InitializeActivePageName.handleComponentEvent(InitializeActivePageName.java:39)
        at $ComponentRequestHandler_4f615eb941a5e.handleComponentEvent(Unknown Source)
        at org.apache.tapestry5.internal.services.RequestOperationTracker$1.perform(RequestOperationTracker.java:55)
        at org.apache.tapestry5.internal.services.RequestOperationTracker$1.perform(RequestOperationTracker.java:52)
        at org.apache.tapestry5.ioc.internal.OperationTrackerImpl.perform(OperationTrackerImpl.java:110)
        at org.apache.tapestry5.ioc.internal.PerThreadOperationTracker.perform(PerThreadOperationTracker.java:84)
        at org.apache.tapestry5.ioc.internal.RegistryImpl.perform(RegistryImpl.java:1264)
        at org.apache.tapestry5.internal.services.RequestOperationTracker.handleComponentEvent(RequestOperationTracker.java:47)
        at $ComponentRequestHandler_4f615eb941a5e.handleComponentEvent(Unknown Source)
        at org.tynamo.security.SecurityComponentRequestFilter.handleComponentEvent(SecurityComponentRequestFilter.java:41)
        at $ComponentRequestFilter_4f615eb941a5b.handleComponentEvent(Unknown Source)
        at $ComponentRequestHandler_4f615eb941a5e.handleComponentEvent(Unknown Source)
        at $ComponentRequestHandler_4f615eb941a28.handleComponentEvent(Unknown Source)
        at org.apache.tapestry5.internal.services.ComponentEventDispatcher.dispatch(ComponentEventDispatcher.java:48)
        at $Dispatcher_4f615eb941a29.dispatch(Unknown Source)
        at $Dispatcher_4f615eb941a22.dispatch(Unknown Source)
        at org.apache.tapestry5.modules.TapestryModule$RequestHandlerTerminator.service(TapestryModule.java:304)
        at org.apache.tapestry5.internal.services.RequestErrorFilter.service(RequestErrorFilter.java:26)
        at $RequestHandler
Re: Tapestry 5.4-beta-26
On Mon, 19 Jan 2015 17:23:36 -0200, Howard Lewis Ship wrote:

> - Split more code out into smaller modules, to encourage reuse (even
>   outside of a Tapestry web application)

Specifically, two new subprojects/JARs were created: BeanModel (everything you need to use BeanModel without any dependence on tapestry-core or tapestry-ioc) and Commons (code from tapestry-ioc and tapestry-core that is used in the BeanModel subproject but represents concepts that aren't specific to BeanModel, such as Registry, TypeCoercer, etc.).

This is all 100% backward-compatible: all classes that were moved to either BeanModel or Commons have the same names and belong to the same packages as before. This was done with the intention of making Tapestry's BeanModel classes usable as a very fast Java property reflection library, without the need for Tapestry-core (the Web framework) or Tapestry-IoC.

--
Thiago H. de Paula Figueiredo
Tapestry, Java and Hibernate consultant and developer
http://machina.com.br
Re: How to build xml sitemap with cron job and make available publicly
On Mon, 19 Jan 2015 17:13:26 -0200, George Christman wrote:

> Hi guys, I'm looking to build nightly sitemaps and make them available
> publicly. The problem I'm facing is once I create the sitemap, where do I
> put it so that it's available publicly?

The root context folder (/src/main/webapp if you're using Maven).

--
Thiago H. de Paula Figueiredo
Tapestry, Java and Hibernate consultant and developer
http://machina.com.br
Tapestry 5.4-beta-26
This release is now available for download, or from Maven central.

Notable fixes and improvements since the previous beta:

- Fixed problems with tracking validation errors on fields inside Ajax updates
- Improvements to the exception report page, and the ability to write an exception report text file
- Fixed the layout issues related to the use of the Autocomplete mixin
- Converted the Autocomplete mixin to use Twitter Typeahead 0.10.5
- Fixed most compatibility issues with Internet Explorer 8
- Split more code out into smaller modules, to encourage reuse (even outside of a Tapestry web application)
- Fixed issues related to keyboard navigation and the modal dialog created by the Confirm mixin
- Removed the unused FormInjector component

Full details: http://tapestry.apache.org/2015/01/19/tapestry-54-beta-26.html

--
Howard M. Lewis Ship
Creator of Apache Tapestry
http://howardlewisship.com
@hlship
How to build xml sitemap with cron job and make available publicly
Hi guys, I'm looking to build nightly sitemaps and make them available publicly. The problem I'm facing is: once I create the sitemap, where do I put it so that it's available publicly? Currently it's available in the Web Pages package, where the root XML file has some links that point to dynamically generated XML pages, but that puts a huge load on the system every time a search engine grabs them.

The next question is: once the location has been established, how do I get the file there?

Thanks in advance,
George