Re: How to import data from Oracle to Solr

2012-07-17 Thread Karl Wright
Hi Wolfgang,

ManifoldCF is meant to handle a binary document and its metadata.  You
must provide the document.  Metadata is optional.

The JDBC connector does not currently support metadata.  In order to
index this, therefore, you will need to decide what should go into
your binary document from your database fields.  You can append
together multiple fields into one document by means of SQL, e.g. the
CONCAT operator or its Oracle equivalent.  This would go into one
field in Solr, then, which is what you'd search on.
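
As a minimal sketch of that concatenation, assuming the ADDRESS table from
the question quoted below (the ADDRESS_TEXT alias is only illustrative):
Oracle's CONCAT function takes exactly two arguments, so joining more than
two values means nesting calls, whereas the || operator chains freely.

```sql
-- Nested CONCAT: each call joins exactly two values.
SELECT CONCAT(CONCAT(CITY, ' '), STREET) AS ADDRESS_TEXT
FROM ADDRESS;

-- The same result with the || operator, which chains any number of operands.
SELECT CITY || ' ' || STREET AS ADDRESS_TEXT
FROM ADDRESS;
```

A NUMBER column such as ZIP can be appended the same way; Oracle converts it
to text implicitly when it is used with ||.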

Alternatively, if you really need separate indexed fields in Solr for
search reasons, you can request a JDBC connector enhancement to add
metadata support.  You'd still need a binary document, although you
could return a blank value for that.

So I guess the answer depends on what you are trying to do on the whole.

Karl


On Tue, Jul 17, 2012 at 6:27 AM, Wolfgang Schreiber
<wolfgang.schrei...@isb-ag.de> wrote:
 Hello,

 we are trying to ingest data from an Oracle database into Solr.
 We managed to insert docs into Solr but only document IDs are inserted and no
 other data fields.

 Can you provide an example of how to set up the import job in ManifoldCF?


 Assume we have the following initial situation:

 1) Our Oracle table looks something like:

 ADDRESS
 --
 ID      NUMBER
 ZIP     NUMBER
 CITY    VARCHAR(2)
 STREET  VARCHAR(2)


 2) In Solr's schema.xml we added the following fields for the database
 columns:
 ...
 <field name="ZIP" type="int" indexed="true" stored="true" />
 <field name="City" type="string" indexed="true" stored="true" />
 <field name="Street" type="string" indexed="true" stored="true" />
 ...


 So here are our questions:

 * How do we have to set up the queries for the ManifoldCF job?
   In particular, what exactly must the seeding query and the data query
   look like?

 * What do the Solr field mappings look like?


 We read your online documentation as well as your MEAP book but could not
 find a working example of a successful import from Oracle to Solr.
 Any help is welcome!

 Best regards
 Wolfgang


Re: How to import data from Oracle to Solr

2012-07-17 Thread Wolfgang Schreiber
Hello Karl,

thank you very much for your quick answer!

So if I understand correctly ...

1) ... all mappings added to the Solr Field Mapping tab are ignored when a
JDBC repository connector is used?

2) Our data query must look something like this (given that || is Oracle's
concatenation operator):
   SELECT ID AS $(IDCOLUMN), ADDRESS_URL AS $(URLCOLUMN),
   'ZIP:' || ZIP || ';city:' || CITY || ';street:' || STREET
   AS $(DATACOLUMN) FROM ADDRESS WHERE ID IN $(IDLIST)

   This would result in DATACOLUMN values like:
   ZIP:70173;city:Stuttgart;street:Heilbronner
   
We tried this statement and it got the data into the text field of our Solr
index.
It seems we are one step further along!

Thank you for your help! Best regards
Wolfgang
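
For reference, a sketch of how the seeding query and the data query could fit
together for this table. It assumes the seeding query uses the same
$(IDCOLUMN) placeholder as the data query above and simply selects every ID.
ADDRESS_URL is taken from the query above and does not appear in the table
listing in the original question, so treat it as a hypothetical column. The
comments are for reading only and are probably best left out of the job's
query fields.

```sql
-- Seeding query (assumed form): return the ID of every row to be crawled.
SELECT ID AS $(IDCOLUMN)
FROM ADDRESS

-- Data query: for each seeded ID, return the URL and the document body.
-- TO_CHAR makes the NUMBER-to-text conversion of ZIP explicit.
SELECT ID AS $(IDCOLUMN),
       ADDRESS_URL AS $(URLCOLUMN),
       'ZIP:' || TO_CHAR(ZIP) || ';city:' || CITY || ';street:' || STREET
         AS $(DATACOLUMN)
FROM ADDRESS
WHERE ID IN $(IDLIST)
```

The two statements are entered separately, one per query field of the job.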


-Original Message-
From: Karl Wright [mailto:daddy...@gmail.com]
Sent: Tue 17.07.2012 12:42
To: user@manifoldcf.apache.org
Subject: Re: How to import data from Oracle to Solr
 