On Sun, 14 Jun 2015 09:45:00 +0100 Mick <michaelkintz...@gmail.com> said:
> On Sunday 14 Jun 2015 08:36:06 you wrote:
> > On Sun, 14 Jun 2015 08:27:44 +0100 Mick <michaelkintz...@gmail.com> said:
> > > On Saturday 13 Jun 2015 03:59:18 Carsten Haitzler wrote:
> > > > and yes - i've tried googles control panel. i manually
> > > > submitted a bunch of urls/pages (but not the url above), so that's
> > > > the only reason they got indexed. google simply isn't spidering.
> > > > :(
> > >
> > > If you have a robots.txt or site.xml at the webroot, have you checked
> > > there is no clash there?
> >
> > https://www.enlightenment.org/robots.txt
> >
> > User-agent: *
> > Allow: /
> >
> > there is no site.xml ... or might it be that dokuwiki turns site.xml into a
> > "this topic doesnt exist" page... ? does that mess things up? (dokuwiki
> > turns any unknown page into that "this doesnt exist yet" allowing you to
> > create it)
>
> That's OK then, because there would be no clash if site.xml does not exist.
> :-)
>
> I also noticed that there is no sitemap.xml, which is not necessary, but
> would help to index pages that may have been missed out for some reason and
> also propose a priority for each.

yeah - this is needed if we have unlinked pages to index. we don't though.
everything is simply linked. those links dont have anything like "nofollow".

https://www.enlightenment.org has:

<a href="/docs">Docs</a>

https://www.enlightenment.org/docs has:

<a href="/docs/efl/advanced/start" class="wikilink1" title="docs:efl:advanced:start">Advanced EFL Topics</a>

https://www.enlightenment.org/docs/efl/advanced/start has:

<a href="/docs/efl/advanced/dnd" class="wikilink1" title="docs:efl:advanced:dnd">DND (Drag and Drop)</a>

https://www.enlightenment.org/docs/efl/advanced/dnd has a paragraph like:

There are two applications visible on the screen. The first is a window with a
blue background and a gengrid containing 8 different images.

if i google for exactly the above string - nothing.
every page in that chain has <meta name="robots" content="index,follow"/> in the
<head>... garhhhr! why?

> I used to run a python script provided by Google some years ago, but with the
> advent of CMS, a script like this should be part of the CMS engine. Have a look
> here
>
> https://support.google.com/webmasters/answer/156184?hl=en
>
> and follow the link on the right hand side navigation menu to borrow some
> code for building a sitemap.xml

sure - i read about the sitemap.xml - yes. then every time anything changes on
e.org i have to provide it (and google limit you per month to i think 5 to 10
manual updates). that is really only needed if we have unlinked content. we
don't. :( this is a workaround to the core problem... what IS the core problem?
:(

> Here's one I built for you using an online generator, but I can't guarantee
> that it is syntactically correct for Google/Bing/Yahoo requirements:
> ===================================================================
> <?xml version="1.0" encoding="UTF-8"?>
> <urlset
>       xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
>       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>       xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9
>             http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd">
> <!-- created with Free Online Sitemap Generator www.xml-sitemaps.com -->
>
> <url>
>   <loc>https://www.enlightenment.org/</loc>
> </url>
> <url>
>   <loc>https://www.enlightenment.org/start</loc>
> </url>
> <url>
>   <loc>https://www.enlightenment.org/about</loc>
> </url>
> <url>
>   <loc>https://www.enlightenment.org/download</loc>
> </url>
> <url>
>   <loc>https://www.enlightenment.org/contact</loc>
> </url>
> <url>
>   <loc>https://www.enlightenment.org/docs</loc>
> </url>
> <url>
>   <loc>https://www.enlightenment.org/about-enlightenment</loc>
> </url>
> <url>
>   <loc>https://www.enlightenment.org/contribute</loc>
> </url>
> <url>
>   <loc>https://www.enlightenment.org/about-efl</loc>
> </url>
> <url>
>   <loc>https://www.enlightenment.org/about-terminology</loc>
> </url>
> <url>
>   <loc>https://www.enlightenment.org/about-rage</loc>
> </url>
> <url>
>   <loc>https://www.enlightenment.org/about-edi</loc>
> </url>
> <url>
>   <loc>https://www.enlightenment.org/media</loc>
> </url>
> <url>
>   <loc>https://www.enlightenment.org/docs-efl-start</loc>
> </url>
> <url>
>   <loc>https://www.enlightenment.org/contact/arcanist</loc>
> </url>
> <url>
>   <loc>https://www.enlightenment.org/docs/c/start</loc>
> </url>
> <url>
>   <loc>https://www.enlightenment.org/docs/efl/start</loc>
> </url>
> <url>
>   <loc>https://www.enlightenment.org/docs/efl/advanced/start</loc>
> </url>
> <url>
>   <loc>https://www.enlightenment.org/_detail/e-logo-title.svg?id=media</loc>
> </url>
> <url>
>   <loc>https://www.enlightenment.org/docs/efl/mainloop</loc>
> </url>
> <url>
>   <loc>https://www.enlightenment.org/docs/efl/ecore_idlers</loc>
> </url>
> <url>
>   <loc>https://www.enlightenment.org/docs/efl/ref/c/key/evas_object_del</loc>
> </url>
> <url>
>   <loc>https://www.enlightenment.org/docs/efl/advanced/dnd</loc>
> </url>
> <url>
>   <loc>https://www.enlightenment.org/docs/efl/ref/c/key/evas_object_ref%28%29</loc>
> </url>
> <url>
>   <loc>https://www.enlightenment.org/docs/efl/ref/c/key/evas_object_ref</loc>
> </url>
> <url>
>   <loc>https://www.enlightenment.org/docs/efl/ref/c/key/evas_object_del%28%29</loc>
> </url>
> </urlset>
> ===================================================================
>
> Google unfriendly URLs like "evas_object_del%28%29" are not going to help, so
> you may want to edit those. Then add it to the webroot and tell Google about
> it, or it will (should) pick it up anyway at its next crawl.

i'm actually not worried about those... i'm worried about core pages with actual
content.
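
for reference, if the sitemap route is taken anyway, a file like the one quoted
above could be regenerated by a small script instead of an online generator.
what follows is only a minimal sketch under assumptions: `build_sitemap` and the
`pages` list are hypothetical names made up for illustration, the urls are a
handful taken from this thread, and nothing here is dokuwiki's or google's own
tooling. it uses only the python standard library.

```python
import io
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol (same one as in the quoted file).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return sitemap XML as bytes, with one <url><loc> entry per URL.

    Hypothetical helper for illustration only.
    """
    # Serialize the sitemap namespace as the default xmlns, without a prefix.
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for url in urls:
        entry = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        loc = ET.SubElement(entry, f"{{{SITEMAP_NS}}}loc")
        loc.text = url  # ElementTree escapes &, < and > on output
    buf = io.BytesIO()
    ET.ElementTree(urlset).write(buf, encoding="utf-8", xml_declaration=True)
    return buf.getvalue()


# Assumed page list: the link chain described earlier in this mail.
pages = [
    "https://www.enlightenment.org/",
    "https://www.enlightenment.org/docs",
    "https://www.enlightenment.org/docs/efl/advanced/start",
    "https://www.enlightenment.org/docs/efl/advanced/dnd",
]

sitemap_xml = build_sitemap(pages)
print(sitemap_xml.decode("utf-8"))
```

the output would still have to be saved as sitemap.xml in the webroot and
submitted, so this does not address why linked pages are not being crawled.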

-- 
------------- Codito, ergo sum - "I code, therefore I am" --------------
The Rasterman (Carsten Haitzler)    ras...@rasterman.com