This looks all right to me, but I might be missing something.
Which version/build of SOLR are you using?

Chantal

Joel Nylund schrieb:
Thanks Chantal, I will keep that in mind for tuning,

for sql I figured  way to combine them into one row using concat, but
I still seem to be having an issue splitting them:

Db now returns as one column categoryType:
TOPIC,LANGUAGE

but my solr result, if you note the item in categoryType  all seem to
be within one str, I would expect it to be in multiple strings within
the array, is this assumption wrong?

<doc>
−
<arr name="categoryType">
<str>TOPIC,LANGUAGE</str>
</arr>
<str name="id">40</str>
<str name="title">feed title</str>
</doc>


Here is my import:
   <document name="doc">
         <entity name="item"
        query="SELECT f.id, f.title
                FROM Feed f
            <field column="id" name="id" />
             <field column="title" name="title" />
                        <entity name="category" query="select cfcr.feedId,
group_concat(cfcr.categoryType) as categoryType
                                                from CFR cfcr
                                                where
                                                cfcr.feedId = '${item.id}' AND
                                                group by cfcr.feedId">
                                        <field column="categoryType" 
name="categoryType"
splityBy="," />
                    </entity>

      </entity>

In schema:
        <field name="categoryType" type="text" indexed="true" stored="true"
required="false" multiValued="true"/>
        <field name="categoryName" type="text" indexed="true" stored="true"
required="false" multiValued="true"/>


what am I missing?

thanks
Joel


On Oct 30, 2009, at 10:00 AM, Chantal Ackermann wrote:

That depends a bit on your database, but it is tricky and might not
be performant.

If you are more of a Java developer, you might prefer retrieving
mutliple rows per SOLR document from your dataSource (join on your
category and main table), and aggregate them in your custom
EntityProcessor. I got a far(!) better performance retrieving
everything in one query and doing the aggregation in Java. But this
is, of course, depending on your table structure and data.

Noble Paul helped me with the custom EntityProcessor, and it turned
out quite easy. Have a look at the thread with the heading from this
mailing list (SOLR-USER):
DataImportHandler / Import from DB : one data set comes in multiple
rows

Cheers,
Chantal


Joel Nylund schrieb:
thanks, but im confused how I can aggregate across rows, I dont know
of any easy way to get my db to return one row for all the categories
(given the hint from your other email), I have split the category
query into a separate entity, but its returning multiple rows, how do
I combine multiple rows into 1 index entity?
thanks
Joel
On Oct 29, 2009, at 8:58 PM, Avlesh Singh wrote:
In the database this is modeled a a 1-N where category table has
the
mapping of feed to category
I need to be able to query , give me all the feeds in any given
category.
How can I best model this in solr?
Seems like multiValued field might help, but how would I populate
it, and
would the query above work?.

Yes you are right. A multivalued field for "categories" is the
answer.

For populating in the index -

 1. If you use DIH to populate your indexes and your datasource is a
 database then you can use DIH's RegexTransformer on an aggregated
list of
 categories. e.g. if your database query retruns "a,b,c,d" in a
column called
 "db_categories", this is how you would put it in DIH's data-config
file -
 <field column="db_categories" name="categories" splityBy="," />.
 2. If you "add" documents to Solr yourself  multiple values for
the field
 can be specified as an array or list of values in the
SolrInputDocument.

A multivalued field provides the same faceting and searching
capabilites
like regular fields. There is no special syntax.

Cheers
Avlesh

On Fri, Oct 30, 2009 at 4:55 AM, Joel Nylund <jnyl...@yahoo.com>
wrote:

Hi,

I have one index so far which contains feeds.  I have been able to
de-normalize several tables and map this data onto the feed entity.
There is
one tricky problem that I need help on.

Feeds have 1 - many categories.

So Lets say we have Category1, Category2 and Category3

Feed 1 - is in Category 1
Feed 2 is in category2 and category3
Feed 3 is in category2
Feed 4 has no category

In the database this is modeled a a 1-N where category table has
the
mapping of feed to category

I need to be able to query , give me all the feeds in any given
category.

How can I best model this in solr?

Seems like multiValued field might help, but how would I populate
it, and
would the query above work?.

thanks
Joel



Reply via email to