These days the best practice for a 'drill-down' facet in a UI is to encode both the unique value of the facet and the displayable string into one facet value. In the UI, you unpack and show the display string, and search with the full facet string.
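A minimal sketch of that packing and unpacking, in Python. The `|` delimiter and the function names are assumptions for illustration, not anything Solr requires:

```python
# Hypothetical delimiter; choose one that can never appear in the unique value.
SEP = "|"


def encode_facet(unique_id, display):
    """Pack the unique value and the display string into one facet value."""
    if SEP in unique_id:
        raise ValueError("unique value must not contain the delimiter")
    return f"{unique_id}{SEP}{display}"


def decode_facet(value):
    """Split a facet value back into (unique value, display string)."""
    # partition() splits on the first delimiter only, so the display
    # string may itself contain the delimiter.
    unique_id, _, display = value.partition(SEP)
    return unique_id, display


facet_value = encode_facet("1", "Attended Conference")
unique_id, label = decode_facet(facet_value)
```

The UI would show `label` and send `facet_value` back unchanged as the drill-down filter.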

If you want to also do date ranges, make a separate matching 'date' field. This will store the date twice. Solr schema design is all about denormalizing.
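As a rough illustration of the denormalized layout, here is a Python sketch that builds one document per event. It assumes one Solr document per person-event pair, and every field name (`person_id`, `category_facet`, `category_year_facet`, `event_date`) is hypothetical; the real names depend on your schema:

```python
from datetime import date


def event_doc(person_id, category_id, category_name, event_date):
    """Build one denormalized document for a single person-event pair."""
    return {
        "person_id": person_id,
        # Facet value: unique id plus display string.
        "category_facet": f"{category_id}|{category_name}",
        # Sub-facet value: the same, with the year appended.
        "category_year_facet": f"{category_id}|{category_name}|{event_date.year}",
        # The date again, as a real date value for range queries.
        "event_date": event_date.isoformat() + "T00:00:00Z",
    }


def year_range_filter(field, year):
    """Build an inclusive range query covering one calendar year."""
    return f"{field}:[{year}-01-01T00:00:00Z TO {year}-12-31T23:59:59Z]"


doc = event_doc("p1", "1", "Attended Conference", date(2008, 9, 8))
fq = year_range_filter("event_date", 2008)
```

The year appears both inside `category_year_facet` and in `event_date`; that duplication is the denormalizing referred to above.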

Tim Gilbert wrote:

Hi guys,

*Question:*

What is the best way to create a Solr schema that supports a ‘multivalue’ field where each value is a two-item array of event category and date? I want faceted searches, counts, and date-range ability on both the category and the dates.

*Details:*

This is a person database where a Person can have details about them (like an address) and a Person has many “Events”. Each Event has a category (type of event) and a date for when that event occurred. At the bottom you will see a simple diagram showing the relationship. Briefly, a Person has many Events, and each Event has a single category and a single Person.

What I would like to be able to do is:

Have a facet which shows all of the event categories, with a ‘sub-facet’ that shows category + date. For example, if a category was “Attended Conference” and the date was 2008-09-08, I’d be able to show a count of all “Attended Conference”, then have a tree-type control and show the years (for example):

Eg.

+ Attended Conference (1038)
|
+--- 2010 (100)
+--- 2009 (134)
+--- 2008 (234)
|
+ Another Event Category (23432)
|
+--- 2010 (234)
+--- 2009 (245)

Etc.

For scale, I expect to have < 100 “Event Categories” and < a million person_event records on < 250,000 persons. I don’t care very much about disk space, so whether the index is 1 GB or 100 GB, that’s okay if the solution works (and it’s fast! :)

*Solutions I looked at:*

    * I looked at poly fields, but they seem to be a fixed length and
      appeared to all be the same type. The typical use case was
      latitude & longitude. I don’t think this will work because there
      are a variable number of events attached to a person.
    * I looked at multiValued, but it didn’t seem to permit two fields
      having a relationship, i.e. Event Category & Event Date. It
      seemed to me that they need to be broken out. That’s not
      necessarily a bad thing, but it didn’t seem ideal.
    * I thought about concatenating category & date to create a fake
      field strictly for faceting purposes, but I believe that will
      break date ranges. E.g. EventCategoryId + “|” + Date = 1|2009 as
      a facet would allow me to show counts for that event type. Seems
      a bit unwieldy to me…

What’s the group’s advice for the best way to handle this situation?

Thanks in advance. As always, sorry if this question has been asked and answered a few times already. I googled for a few hours before writing this… but things change so fast with Solr that any article older than a year was suspect to me, and there are so many patches that provide additional functionality…

Tim

Schema:
