Hi.
I'm not sure I understand. If you need to see the indexed metatags, look at
your Solr administration page: http://localhost:8989/solr/admin/schema.jsp
if you are using Jetty, or http://localhost:8080/solr/admin/schema.jsp if it
is a Tomcat deployment. On this page you will see the fields supported by
Solr. It is possible that they contain no documents yet, because you need to
reindex as I said.
Tell me if this is clear.
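
Besides the schema page, one way to check whether any indexed document really carries a metatag field is to ask Solr how many documents have a value in it. A minimal sketch in Python, assuming the Tomcat URL above, the standard /select handler and a field called keywords (use whatever name your schema declares); the helper name field_count_url is hypothetical:

```python
from urllib.parse import urlencode


def field_count_url(solr_base, field):
    """Build a Solr select URL matching every document that has a value in `field`."""
    params = {
        "q": f"{field}:[* TO *]",  # open range query: any value in the field
        "rows": 0,                 # only the hit count is of interest
        "wt": "json",
    }
    return f"{solr_base.rstrip('/')}/select?{urlencode(params)}"


if __name__ == "__main__":
    # Assumed base URL and field name; adjust both to your deployment.
    print(field_count_url("http://localhost:8080/solr", "keywords"))
```

Opening the printed URL in a browser should return a response whose numFound is the number of documents with that field; 0 would suggest the field was never populated.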


_____________________________________________________________________
Ing. Eyeris Rodriguez Rueda
Phone: 837-3370
Universidad de las Ciencias Informáticas
_____________________________________________________________________

-----Original Message-----
From: ML mail [mailto:[email protected]]
Sent: Sunday, May 13, 2012 7:02 AM
To: Ing. Eyeris Rodriguez Rueda; [email protected]
Subject: Re: Indexing HTML metatags from Nutch into Solr

I will then try deactivating the parse-metatags plugin.

By the way, do you or anyone else know exactly what modifications are
required on the Apache Solr side to get the metatags working?

Regards



----- Original Message -----
From: Ing. Eyeris Rodriguez Rueda <[email protected]>
To: ML mail <[email protected]>; [email protected]
Cc: 
Sent: Friday, May 11, 2012 10:38 PM
Subject: Re: Indexing HTML metatags from Nutch into Solr

Hi.
I only have the index-metatags plugin in my nutch-site.xml and it works
successfully. I also tried parse-metatags without a positive result and in
the end I don't use it.
Also make sure that your schema in Nutch is the same as the one in Solr.

If your index is not big, you can erase the folders of your Solr index and
your Nutch data:
Nutch (crawldb, linkdb, segments)
Solr (index, spellchecker).
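
As an alternative to deleting the Solr index folder by hand, Solr's XML update handler accepts a delete-by-query followed by a commit. A rough sketch in Python, assuming Solr is reachable at the Tomcat URL mentioned in this thread and exposes the default /update handler; the function name build_clear_index_requests is hypothetical, and the Nutch folders (crawldb, linkdb, segments) still have to be removed separately:

```python
import urllib.request


def build_clear_index_requests(solr_base):
    """Return the two POST requests that delete every document and then commit."""
    update_url = solr_base.rstrip("/") + "/update"
    headers = {"Content-Type": "text/xml; charset=utf-8"}
    bodies = ["<delete><query>*:*</query></delete>", "<commit/>"]
    return [
        urllib.request.Request(update_url, data=body.encode("utf-8"), headers=headers)
        for body in bodies
    ]


if __name__ == "__main__":
    # Assumed URL. This really empties the index, so double-check it first.
    for request in build_clear_index_requests("http://localhost:8080/solr"):
        with urllib.request.urlopen(request) as response:
            print(response.status)
```

Building the requests separately from sending them makes it easy to inspect the URL and payload before anything is deleted.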




**************************************************************************

----- Original Message -----
From: "ML mail" <[email protected]>
To: "Ing. Eyeris Rodriguez Rueda" <[email protected]>, [email protected]
Sent: Friday, May 11, 2012 19:37:13
Subject: Re: Indexing HTML metatags from Nutch into Solr

Hi,

Actually I have already done all that, as I followed the Nutch Wiki for this
purpose: http://wiki.apache.org/nutch/IndexMetatags

Your suggestion to clean my segments and the Solr index and then re-index
is a good idea. Could you help me with the commands for these 3 steps?

Many thanks!



----- Original Message -----
From: Ing. Eyeris Rodriguez Rueda <[email protected]>
To: [email protected]; ML mail <[email protected]>
Cc:
Sent: Friday, May 11, 2012 7:55 PM
Subject: Re: Indexing HTML metatags from Nutch into Solr

Hello, I am using the index-metatags plugin (I suppose that you have the
index-metatags plugin in Nutch's plugins folder).
First you need to include something like this in nutch-site.xml (in the
plugin.includes value):
|index-(basic|anchor|metatags|more)|
You also need to list the metatag names that you want to index (in the same
file):
<property>
    <name>metatags.names</name>
    <value>category;keywords;author;comments;description;subject;last_modified</value>
    <description>For plugin index-metatags: Indicate here the name of the
    html meta tag that should be
    parsed. Use a semicolon separated list if you want multiple
    tags, or use '*' to index all.
    Example: description;keywords;role
    </description>
</property>
I have only these
(category;keywords;author;comments;description;subject;last_modified).
After that you have to configure your solrindex-mapping like this:
<field dest="subject" source="subject"/>
<field dest="description" source="description"/>
<field dest="comments" source="comments"/>
<field dest="author" source="author"/>
<field dest="keywords" source="keywords"/>
<field dest="category" source="category"/>
<field dest="lastModified" source="lastModified"/>

I suggest cleaning your segments and Solr index and then reindexing.
I think that this will solve your problem.

**************************************************************************

----- Original Message -----
From: "ML mail" <[email protected]>
To: [email protected]
Sent: Friday, May 11, 2012 6:40:36
Subject: Indexing HTML metatags from Nutch into Solr

Hello,

I am using Nutch 1.4 with Solr 3.6.0 and would like to get the HTML keywords
and description metatags indexed into Solr. On the Nutch side I have
followed http://wiki.apache.org/nutch/IndexMetatags to get Nutch parsing
and extracting the metatags (using the index-metatags and parse-metatags
plugins), but when I run solrindex they simply don't get indexed.

In Solr I am using the schema.xml provided by Nutch and have added the
following fields for the metatags:
 
        <!-- fields for the metatags plugin -->
        <field name="metatag.description" type="text" stored="true" indexed="true"/>
        <field name="metatag.keywords" type="text" stored="true" indexed="true"/>

and have created a solrindex-mapping.xml file as follows:

<mapping>
  <fields>
    <field dest="description" source="metatag.description"/>
    <field dest="keywords" source="metatag.keywords"/>
  </fields>
</mapping>
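
A quick consistency check between the two snippets above is to confirm that every dest in solrindex-mapping.xml names a field that schema.xml declares. A minimal sketch in Python, assuming both fragments are available as strings shaped like the ones in this message; the function name unmapped_dests is hypothetical, and dynamicField and copyField declarations are ignored:

```python
import xml.etree.ElementTree as ET


def unmapped_dests(mapping_xml, schema_fields_xml):
    """Return mapping dest names that have no matching <field name=...> declaration."""
    declared = {f.get("name") for f in ET.fromstring(schema_fields_xml).iter("field")}
    dests = {f.get("dest") for f in ET.fromstring(mapping_xml).iter("field")}
    return sorted(dests - declared)


MAPPING = """
<mapping>
  <fields>
    <field dest="description" source="metatag.description"/>
    <field dest="keywords" source="metatag.keywords"/>
  </fields>
</mapping>
"""

# Only the two fields quoted in this message, wrapped in a root element so the
# fragment parses. A real schema.xml declares many more fields.
SCHEMA_FIELDS = """
<fields>
  <field name="metatag.description" type="text" stored="true" indexed="true"/>
  <field name="metatag.keywords" type="text" stored="true" indexed="true"/>
</fields>
"""

if __name__ == "__main__":
    print(unmapped_dests(MAPPING, SCHEMA_FIELDS))  # ['description', 'keywords']
```

With just these two fragments the check reports description and keywords, because the added schema fields are named metatag.description and metatag.keywords. Whether that is the actual cause depends on the rest of schema.xml, which is not shown here.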

The rest is pretty much a default install of Solr. So my question is: why
can't I see the metatags indexed in Solr? Did I maybe forget to configure
something in Solr?

Any suggestions are welcome.

