On Sun, 14 Jun 2015 09:45:00 +0100 Mick <michaelkintz...@gmail.com> said:

> On Sunday 14 Jun 2015 08:36:06 you wrote:
> > On Sun, 14 Jun 2015 08:27:44 +0100 Mick <michaelkintz...@gmail.com> said:
> > > On Saturday 13 Jun 2015 03:59:18 Carsten Haitzler wrote:
> > > > and yes - i've tried google's control panel. i manually
> > > > submitted a bunch of urls/pages (but not the url above), so that's
> > > > the only reason they got indexed. google simply isn't spidering.
> > > > :(
> > > 
> > > If you have a robots.txt or site.xml at the webroot, have you checked
> > > there is no clash there?
> > 
> > https://www.enlightenment.org/robots.txt
> > 
> > User-agent: *
> > Allow: /
> > 
> > there is no site.xml ... or might it be that dokuwiki turns site.xml into a
> > "this topic doesnt exist" page... ? does that mess things up? (dokuwiki
> > turns any unknown page into that "this doesnt exist yet" allowing you to
> > create it)
> 
> That's OK then, because there would be no clash if site.xml does not exist.  
> :-)
> 
> I also noticed that there is no sitemap.xml, which is not necessary, but 
> would help to index pages that may have been missed out for some reason and 
> also propose a priority for each.

yeah - this is only needed if we have unlinked pages to index. we don't though.
everything is simply linked. those links don't have anything like "nofollow".

https://www.enlightenment.org has:
  <a href="/docs">Docs</a>

https://www.enlightenment.org/docs has:
  <a href="/docs/efl/advanced/start" class="wikilink1"
title="docs:efl:advanced:start">Advanced EFL Topics</a>

https://www.enlightenment.org/docs/efl/advanced/start has:
  <a href="/docs/efl/advanced/dnd" class="wikilink1"
title="docs:efl:advanced:dnd">DND (Drag and Drop)</a>

https://www.enlightenment.org/docs/efl/advanced/dnd has a paragraph like:
  There are two applications visible on the screen. The first is a window with
a blue background and a gengrid containing 8 different images.

if i google for exactly the above string - nothing. every page in that chain
has <meta name="robots" content="index,follow"/> in the <head>...

garhhhr! why?
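
For reference, the manual check above can be repeated mechanically. This is only a rough sketch using the Python standard library: it parses a robots.txt body and one page's HTML and reports the robots meta directives and any rel="nofollow" links. The names `crawl_hints` and `robots_allows` are made up for illustration, the "Googlebot" agent string is an assumption, and the sample page is pasted in by hand rather than fetched from the site.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class CrawlHints(HTMLParser):
    """Collect robots meta directives and links from a single HTML page."""

    def __init__(self):
        super().__init__()
        self.robots = set()  # directives found in <meta name="robots">
        self.links = []      # (href, is_nofollow) for every <a href=...>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            content = attrs.get("content") or ""
            self.robots.update(
                d.strip().lower() for d in content.split(",") if d.strip()
            )
        elif tag == "a" and attrs.get("href"):
            rel = (attrs.get("rel") or "").lower().split()
            self.links.append((attrs["href"], "nofollow" in rel))


def crawl_hints(html):
    """Return (robots_directives, links) for one page (hypothetical helper)."""
    parser = CrawlHints()
    parser.feed(html)
    parser.close()
    return parser.robots, parser.links


def robots_allows(robots_txt, url, agent="Googlebot"):
    """Check a robots.txt body against a URL (agent name is an assumption)."""
    rp = RobotFileParser()
    rp.parse(robots_txt.splitlines())
    return rp.can_fetch(agent, url)


# Sample data copied by hand from the pages quoted in this mail.
ROBOTS_TXT = "User-agent: *\nAllow: /\n"
PAGE = """<html><head><meta name="robots" content="index,follow"/></head>
<body><a href="/docs/efl/advanced/dnd" class="wikilink1"
title="docs:efl:advanced:dnd">DND (Drag and Drop)</a></body></html>"""

if __name__ == "__main__":
    robots, links = crawl_hints(PAGE)
    print("meta robots:", sorted(robots))
    print("links:", links)
    print("robots.txt allows:", robots_allows(
        ROBOTS_TXT, "https://www.enlightenment.org/docs/efl/advanced/dnd"))
```

If a page in the chain reported "noindex" or "nofollow", or a link carried rel="nofollow", that would explain a crawler stopping there. None of the pages quoted above do, so this only rules out the obvious causes.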

> I used to run a python script provided by Google some years ago, but with the 
> advent of CMSes, a script like this should be part of the CMS engine.  Have a 
> look here 
> 
> https://support.google.com/webmasters/answer/156184?hl=en
> 
> and follow the link on the right hand side navigation menu to borrow some
> code for building a sitemap.xml

sure - i read about the sitemap.xml - yes. then every time anything changes on
e.org i have to provide an updated one (and google limits you to, i think, 5 to
10 manual updates per month). that is really only needed if we have unlinked
content. we don't. :( this is a workaround for the core problem... what IS the
core problem? :(
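
If a sitemap did turn out to be worth having, regenerating it on every change would not have to be manual. A minimal sketch, assuming the list of page URLs is already available from somewhere (for example the wiki's own page index, which is not shown here); `build_sitemap` is a hypothetical helper name, and only `<loc>` entries are written:

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org 0.9 schema.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return sitemap.xml content (bytes) with one <url><loc> entry per URL."""
    urlset = ET.Element("urlset", {"xmlns": SITEMAP_NS})
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    # xml_declaration needs Python 3.8 or newer.
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


if __name__ == "__main__":
    pages = [
        "https://www.enlightenment.org/",
        "https://www.enlightenment.org/docs",
        "https://www.enlightenment.org/docs/efl/advanced/dnd",
    ]
    print(build_sitemap(pages).decode("utf-8"))
```

The output could be written to sitemap.xml at the webroot by whatever job runs after a page edit. That still only works around the crawling problem rather than explaining it.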

> Here's one I built for you using an online generator, but I can't guarantee 
> that it is syntactically correct for Google/Bing/Yahoo requirements:
> ===================================================================
> <?xml version="1.0" encoding="UTF-8"?>
> <urlset
>       xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
>       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>       xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9
>             http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd">
> <!-- created with Free Online Sitemap Generator www.xml-sitemaps.com -->
> 
> <url>
>   <loc>https://www.enlightenment.org/</loc>
> </url>
> <url>
>   <loc>https://www.enlightenment.org/start</loc>
> </url>
> <url>
>   <loc>https://www.enlightenment.org/about</loc>
> </url>
> <url>
>   <loc>https://www.enlightenment.org/download</loc>
> </url>
> <url>
>   <loc>https://www.enlightenment.org/contact</loc>
> </url>
> <url>
>   <loc>https://www.enlightenment.org/docs</loc>
> </url>
> <url>
>   <loc>https://www.enlightenment.org/about-enlightenment</loc>
> </url>
> <url>
>   <loc>https://www.enlightenment.org/contribute</loc>
> </url>
> <url>
>   <loc>https://www.enlightenment.org/about-efl</loc>
> </url>
> <url>
>   <loc>https://www.enlightenment.org/about-terminology</loc>
> </url>
> <url>
>   <loc>https://www.enlightenment.org/about-rage</loc>
> </url>
> <url>
>   <loc>https://www.enlightenment.org/about-edi</loc>
> </url>
> <url>
>   <loc>https://www.enlightenment.org/media</loc>
> </url>
> <url>
>   <loc>https://www.enlightenment.org/docs-efl-start</loc>
> </url>
> <url>
>   <loc>https://www.enlightenment.org/contact/arcanist</loc>
> </url>
> <url>
>   <loc>https://www.enlightenment.org/docs/c/start</loc>
> </url>
> <url>
>   <loc>https://www.enlightenment.org/docs/efl/start</loc>
> </url>
> <url>
>   <loc>https://www.enlightenment.org/docs/efl/advanced/start</loc>
> </url>
> <url>
>   <loc>https://www.enlightenment.org/_detail/e-logo-title.svg?id=media</loc>
> </url>
> <url>
>   <loc>https://www.enlightenment.org/docs/efl/mainloop</loc>
> </url>
> <url>
>   <loc>https://www.enlightenment.org/docs/efl/ecore_idlers</loc>
> </url>
> <url>
>   <loc>https://www.enlightenment.org/docs/efl/ref/c/key/evas_object_del</loc>
> </url>
> <url>
>   <loc>https://www.enlightenment.org/docs/efl/advanced/dnd</loc>
> </url>
> <url>
>   <loc>https://www.enlightenment.org/docs/efl/ref/c/key/evas_object_ref%28%29</loc>
> </url>
> <url>
>   <loc>https://www.enlightenment.org/docs/efl/ref/c/key/evas_object_ref</loc>
> </url>
> <url>
>   <loc>https://www.enlightenment.org/docs/efl/ref/c/key/evas_object_del%28%29</loc>
> </url>
> </urlset>
> ===================================================================
> 
> Google-unfriendly URLs like "evas_object_del%28%29" are not going to help, so 
> you may want to edit those.  Then add it to the webroot and tell Google about 
> it, or it will (should) pick it up anyway at its next crawl.

i'm actually not worried about those... i'm worried about core pages with
actual content.


-- 
------------- Codito, ergo sum - "I code, therefore I am" --------------
The Rasterman (Carsten Haitzler)    ras...@rasterman.com


_______________________________________________
enlightenment-users mailing list
enlightenment-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/enlightenment-users
