Upayavira wrote:

...

4) I just started thinking about your excludes code (assuming that link gathering does start working again). Basically, there are a number of things one can exclude on: source URI, source prefix, full source URI (prefix and URI), and final destination URI. How about something like:

<exclude type="regexp| wildcard" src="source-uri | source-prefix | full-source-uri | dest-uri" match="<pattern>"/>
<include type="regexp| wildcard" src="source-uri | source-prefix | full-source-uri | dest-uri" match="<pattern>"/>

I'd be happy with a simple 'ignore this link', but wildcards would be great.

I'm a bit confused by all the @src types though. Is 'dest-uri' the final
filesystem destination? Is there anything possible with src="dest-uri"
that isn't possible otherwise? Does 'src-prefix' mean "ignore URIs
starting with this prefix"? If so, why not just use a wildcard?

The thing is, you might want to exclude a certain URL from going to one destination but not another, so you'd need to specify a wildcard on either source or destination. However, given that a wildcard can be used to deal with prefixes, we don't need to specifically worry about prefixes. So, I propose:


<include-source match="<wildcard pattern>"/>
<include-destination match="<wildcard pattern>"/>
<exclude-source match="<wildcard pattern>"/>
<exclude-destination match="<wildcard pattern>"/>

I don't want to use <exclude type="source" ...> as I want to reserve the type attribute for specifying whether to use a wildcard or regexp matcher.

Thoughts?

I've got some basic code in place to do includes/excludes - I'll keep you posted.

I've just run my code through, and it seems to have worked. I'll give it a bit more testing later today and commit it either this evening or tomorrow.


It is very simple: I haven't yet implemented 'destination' excludes, and have only done wildcard excludes. Matching happens against the 'absolute' URL, i.e. including any source prefix. I've also implemented includes in the same way, but have not yet tested them.

I've made it so that if both includes and excludes are present, a URL is first checked to see if it should be included, and then to see if it should be excluded. So you might say <include pattern="subsite/**"/> and <exclude pattern="subsite/images/**"/>.
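To make the include-then-exclude order concrete, here is a minimal sketch in Python. It is not the committed code: the function names, the wildcard-to-regex translation ('**' spans '/', a single '*' stays within one path segment), and the rule that an empty include list admits everything are all assumptions made for illustration.

```python
import re


def wildcard_to_regex(pattern):
    """Translate a wildcard pattern into a compiled regex.

    Assumed semantics: '**' matches any characters including '/',
    a single '*' matches any characters except '/'.
    """
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**", i):
            parts.append(".*")
            i += 2
        elif pattern[i] == "*":
            parts.append("[^/]*")
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts))


def should_process(uri, includes, excludes, src_prefix=""):
    """Decide whether a URI is processed (hypothetical helper).

    Matching is done against the 'absolute' URL, i.e. prefix + URI.
    Includes are checked first, then excludes.
    """
    absolute = src_prefix + uri
    # Assumption: with no include patterns, everything is a candidate.
    if includes and not any(
        wildcard_to_regex(p).fullmatch(absolute) for p in includes
    ):
        return False
    # An excluded URL is dropped even if an include matched it.
    return not any(wildcard_to_regex(p).fullmatch(absolute) for p in excludes)
```

With includes ["subsite/**"] and excludes ["subsite/images/**"], a page such as subsite/index.html passes both checks, while subsite/images/logo.png is included by the first pattern and then dropped by the second.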

I have tested the following, which generates the docs, but without any images.

<exclude pattern="**/images/**"/>
<uri type="append" src-prefix="docs/" src="index.html" dest="build/dest/" />


Regards, Upayavira



