Thorsten Scherler wrote:
On Wed, 14-01-2009 at 12:11 +0100, Andrzej Bialecki wrote:
Hi devs,

I found this Forrest plugin at the Forrest site. If you guys have a moment to spare I'd really appreciate your advice.

I'm a complete newbie to Forrest; the only things I know how to do are to fill in the blanks in the default site xdocs and generate static HTML. It's not much, I'm afraid.

Should be enough. ;)

Since the plugin is still in the whiteboard, you need to use the TRUNK of
Forrest. The best way to get started with the plugin:

cd $FORREST_HOME/whiteboard/plugins/org.apache.forrest.plugin.output.solr
forrest run

http://localhost:8888/index.html -> here you find some samples and basic
instructions.

Ok, so far so good. I was able to complete these steps, and I can see the documentation.


Now, I need to index the content of a Forrest site in Solr, using a custom schema - e.g. the "id" in my case should be equivalent to the full URL of the page of the deployed site.

You have seen http://wiki.apache.org/solr/SolrForrest and
http://forrest.apache.org/pluginDocs/plugins_0_80/org.apache.forrest.plugin.output.solr/

Yes, but that documentation is not helpful for a newbie like me. It lists some configuration snippets without telling where to put them.

Basically, I need step-by-step instructions on how to generate _static_ Solr document output, exactly like the one here http://192.168.0.251:8888/index-creation.solr.add - but that one is generated dynamically, i.e. it requires a running instance of Forrest, and I need to generate it statically.


First, I'm stuck conceptually - sitting in the top-level dir of the forrest site, what is it that I have to do to produce a file with the Solr <add> documents?

Actually, the plugin does that for you.
http://svn.apache.org/viewvc/forrest/trunk/whiteboard/plugins/org.apache.forrest.plugin.output.solr/output.xmap?view=markup
...
<!-- Output xdocs as solr docs -->
<map:match pattern="**.solr">
 <map:generate src="cocoon://{1}.xml"/>
 <map:transform src="{lm:solr.transform.xdocs.solrDoc}">
  <map:parameter name="document-url" value="{1}.xml"/>
  <map:parameter name="project" value="{properties:project.name}"/>
 </map:transform>
 <map:serialize/>
</map:match>

I'm not sure what this means - does it mean that I have to specify somewhere the list of document names with .xml replaced by .solr?


You are talking about extending ./resources/stylesheets/xdocs-to-solrDoc.xsl with your custom attributes. First have a look at the plugin's XSL to get an idea of how we are doing things.

Now copy the file into your project's stylesheet dir (default is
src/documentation/resources/stylesheets/ = {project.stylesheets-dir}).

Let Forrest know that you want to provide a custom location by adding the following to
your project's locationmap.xml after the "locator" element:
<match pattern="solr.transform.xdocs.solrDoc">
 <location src="{project.stylesheets-dir}/xdocs-to-solrDoc.xsl"/>
</match>

From here you need to implement your logic.
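
As a rough illustration of what such an override might look like, here is a minimal XSLT 1.0 sketch. It is not the plugin's actual stylesheet. It assumes the "document-url" and "project" parameters arrive as declared in the sitemap match above, that the input is a standard xdoc (document/header/title), and that the target Solr schema has "id", "project" and "title" fields. The "root-url" variable and the example.org address are hypothetical placeholders.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Parameters passed in by the plugin's sitemap match for **.solr -->
  <xsl:param name="document-url"/>
  <xsl:param name="project"/>

  <!-- ASSUMPTION: hypothetical root URL of the deployed site;
       replace with your own, it is not defined by the plugin -->
  <xsl:variable name="root-url" select="'http://www.example.org/'"/>

  <xsl:template match="/">
    <add>
      <doc>
        <!-- id = root URL + page path, with .xml swapped for .html -->
        <field name="id">
          <xsl:value-of select="concat($root-url,
              substring-before($document-url, '.xml'), '.html')"/>
        </field>
        <!-- ASSUMPTION: field names must match your own Solr schema -->
        <field name="project">
          <xsl:value-of select="$project"/>
        </field>
        <field name="title">
          <xsl:value-of select="document/header/title"/>
        </field>
      </doc>
    </add>
  </xsl:template>

</xsl:stylesheet>
```

With these assumptions, a request for index.solr would receive document-url "index.xml" and produce an id of http://www.example.org/index.html. Further metadata could be added as extra field elements in the same template.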

I already added the Solr output plugin to skinconf.xml. I discovered that I can get this via webapp, but I'd rather not actually run the webapp.

Hmm, skinconf.xml has nothing to do with the plugin. Where did you get
the impression that you need to edit this file? You need to add the
plugin to "project.required.plugins".

Ok, I did that. Still after running 'forrest site' I don't see the solr documents anywhere.


Second, how can I modify the schema of the produced documents, so that e.g. the id is the full URL - a configurable root URL plus the page name, and so that I can add other metadata to the docs?

You will need to create your own XSL to override the default one, as
described above.

Thanks in advance for any help that you can provide!

Please keep on asking if this is still not clear.

Thanks for your help. I'm afraid this is still not very clear...

--
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com