Re: Hierarchical faceting

2014-11-17 Thread Evan Pease

I'm looking to see if Solr has any in-built tokenizer that splits the tokens
and prepends the depth information. I'd like to avoid building depth
information into the field values if Solr already has something that can be
used.

So the goal is to find out the level of the tree for each category? You
could determine this in the UI by splitting the category facet value string
by the separator.

As you're aware, when you query a field indexed using
solr.PathHierarchyTokenizerFactory, you still get the full category path
back as a facet value.

For example, if a user navigates to Phy:
fq={!term f=category}NonFic/Sci/Phy

The facet values that are returned will look like this (made up counts):

<lst name="category">
  <int name="NonFic/Sci/Phy">10</int>
  <int name="NonFic/Sci/Phy/Quantum">10</int>
</lst>

You could find out the level by doing .split("/").length on each value.
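As a rough illustration of the split-and-count idea, here is a minimal Python sketch. The helper names `depth` and `next_level` are made up for this example, and the facet counts are the same made-up ones as above. It computes the level of each facet value and keeps only the values one level below the path the user has selected.

```python
def depth(path, sep="/"):
    """Number of path segments, e.g. 'NonFic/Sci/Phy' has 3."""
    return len(path.split(sep))


def next_level(facets, selected, sep="/"):
    """Keep only the facet values exactly one level below the selected path."""
    wanted = depth(selected, sep) + 1
    return {
        value: count
        for value, count in facets.items()
        if value.startswith(selected + sep) and depth(value, sep) == wanted
    }


# Facet values as returned for fq={!term f=category}NonFic/Sci/Phy
facets = {"NonFic/Sci/Phy": 10, "NonFic/Sci/Phy/Quantum": 10}
children = next_level(facets, "NonFic/Sci/Phy")
```

The same filtering could be done in whatever language the UI is written in; only the separator needs to match the tokenizer's delimiter.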

ECP

On Mon, Nov 17, 2014 at 9:25 PM, Jason Hellman jhellman.innov...@gmail.com
wrote:

I realize you want to avoid putting depth details into the field values,
but something has to imply the depth. With that in mind, here is another
approach, assuming you are chasing down a single branch of a tree (and all
its sub-branch offshoots):

Use dynamic fields
Step from one level to the next with a simple increment
Build the facet for the next level on the call
The UI needs only know the current level

The field name might look like this:

step_fieldname_n

With a dynamic field configuration of:

step_*

The content of the step_fieldname_n field would be either the string of
the field value or the delimited path of the current level (as suited to
taste). Either way, the fieldType would most likely be String (or some
variation thereof).

The UI would then call:

facet.field=step_fieldname_n+1

The UI would also need to carry the n+1 into the fq link:

fq=step_fieldname_n+1:facetvalue

The trick of all of this is that you must build your index with the depth
of your hierarchy in mind to place the values into the suitable fields.
You could, of course, write an UpdateProcessor to accomplish this if that
seems fitting.
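To make the indexing side of this concrete, here is a minimal Python sketch of how a client might populate the per-level dynamic fields before sending a document to Solr. The function names, the "category" field name, and the choice to number levels from 0 and store the delimited path (rather than the bare value) are all assumptions made for illustration.

```python
def step_fields(path, fieldname="category", sep="/"):
    """Map a delimited path to one step_<fieldname>_<n> field per level."""
    segments = path.split(sep)
    return {
        f"step_{fieldname}_{level}": sep.join(segments[: level + 1])
        for level in range(len(segments))
    }


def next_facet_field(current_level, fieldname="category"):
    """Name of the field the UI would facet on for the next level down."""
    return f"step_{fieldname}_{current_level + 1}"


doc_fields = step_fields("NonFic/Sci/Phy/Quantum")
```

With this layout the UI only has to remember the current level: after a click on a level-1 value it would facet on `next_facet_field(1)`, that is, `step_category_2`.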

Jason

On Nov 17, 2014, at 12:22 PM, Alexandre Rafalovitch arafa...@gmail.com
wrote:

You might be able to stick in a couple of PatternReplaceFilterFactory
in a row with regular expressions to catch different levels.

Something like:

<filter class="solr.PatternReplaceFilterFactory"
        pattern="^[^0-9][^/]+/[^/]+/[^/]+$" replacement="2$0" />
<filter class="solr.PatternReplaceFilterFactory"
        pattern="^[^0-9][^/]+/[^/]+$" replacement="1$0" />
...

I did not test this; you may need to escape some things or put explicit
groups in there.
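The regular-expression idea can be tried outside Solr first. Below is a minimal Python sketch of the same chain of replacements. It assumes every path segment is matched with `[^/]+` and uses Python's `\g<0>` where the Java replacement syntax uses `$0`. It only models the string rewriting, not the Solr analysis chain itself.

```python
import re

# (pattern, replacement) pairs, deepest level first. A token that has already
# received a digit prefix no longer matches the leading [^0-9], so later rules
# leave it alone.
RULES = [
    (re.compile(r"^[^0-9][^/]+/[^/]+/[^/]+$"), r"2\g<0>"),
    (re.compile(r"^[^0-9][^/]+/[^/]+$"), r"1\g<0>"),
]


def prefix_depth(token):
    """Apply each replacement in order, as a chain of filters would."""
    for pattern, replacement in RULES:
        token = pattern.sub(replacement, token)
    return token


prefixed = [prefix_depth(t) for t in ("NonFic", "NonFic/Sci", "NonFic/Sci/Phy")]
```

Only two levels are covered here, matching the two filters shown; a single-segment token such as NonFic passes through unchanged unless a further rule is added for it.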

Regards,
Alex.
P.s.
http://www.solr-start.com/javadoc/solr-lucene/org/apache/lucene/analysis/pattern/PatternReplaceFilterFactory.html

Personal: http://www.outerthoughts.com/ and @arafalov
Solr resources and newsletter: http://www.solr-start.com/ and @solrstart
Solr popularizers community: https://www.linkedin.com/groups?gid=6713853

On 17 November 2014 15:01, rashmy1 rashmy.appanerava...@siemens.com
wrote:
Hi Alexandre,
Yes, I've read this post and that's the 'Option1' listed in my initial
post.

I'm looking to see if Solr has any in-built tokenizer that splits the tokens
and prepends the depth information. I'd like to avoid building depth
information into the field values if Solr already has something that can be
used.

Thanks!

--
View this message in context:
http://lucene.472066.n3.nabble.com/Hierarchical-faceting-tp4169263p4169536.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Hierarchical faceting

2014-11-14 Thread Evan Pease

Hi Rashmi,

Here are some more details on how to use the PathHierarchyTokenizer that
Oleg provided the link to.

If this is your document:

 *Sample document*
 doc
 name=Pbook1
 category=NonFic/Sci/Phy/Quantum
 author=ABC
 price=20.00
 doc

Then, in your schema.xml:

<field name="category" type="tree" indexed="true" stored="true"
       multiValued="true"/>
<fieldType name="tree" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.PathHierarchyTokenizerFactory" delimiter="/" />
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory" />
  </analyzer>
</fieldType>

Then, in your Solr query, you can simply add:

facet=true
facet.field=category

You should see a facet that contains each level of the taxonomy with counts.
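To see why every level shows up in the facet, it helps to look at what the index-time tokenizer emits for a single value. The following Python sketch only imitates that output for a simple path with a one-character delimiter; it is not the Lucene implementation, and the function name is made up.

```python
def path_hierarchy_tokens(path, delimiter="/"):
    """Imitate the tokens emitted for one path: one token per ancestor level."""
    segments = path.split(delimiter)
    return [delimiter.join(segments[:i]) for i in range(1, len(segments) + 1)]


tokens = path_hierarchy_tokens("NonFic/Sci/Phy/Quantum")
# tokens == ["NonFic", "NonFic/Sci", "NonFic/Sci/Phy", "NonFic/Sci/Phy/Quantum"]
```

Each of those tokens is indexed for the document, so the document is counted once under every ancestor path in the category facet.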

To navigate the taxonomy, you add filter queries using the part of the path
you want to narrow the results down to (values from the category facet).

So, for example a user clicks on NonFic

facet=true
facet.field=category
fq={!term f=category}NonFic

Then NonFic/Sci

fq={!term f=category}NonFic/Sci

Then NonFic/Sci/Phy

fq={!term f=category}NonFic/Sci/Phy

etc.
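The drill-down requests above can be assembled programmatically. This is a minimal Python sketch using only the standard library; the `q=*:*` main query and the function name are assumptions, and the resulting query string would be appended to whatever select URL your Solr core exposes.

```python
from urllib.parse import urlencode


def drill_down_params(path=None, field="category"):
    """Build the request parameters for one step of the drill-down."""
    params = [("q", "*:*"), ("facet", "true"), ("facet.field", field)]
    if path:
        # The term query parser matches the path as a single exact term,
        # so the slashes need no escaping beyond normal URL encoding.
        params.append(("fq", f"{{!term f={field}}}{path}"))
    return params


# A user has clicked through to NonFic/Sci.
query_string = urlencode(drill_down_params("NonFic/Sci"))
```

Calling `drill_down_params()` with no path gives the initial request, before any category has been selected.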

If you only want to display the leaf level category and indent child
categories you can easily do this in your UI by splitting the facet value
on your separator, / in this case.
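A minimal Python sketch of that display logic follows, assuming two spaces of indentation per level; the function name and the sample values are made up for illustration.

```python
def display_label(value, sep="/", indent="  "):
    """Show only the last path segment, indented by its depth in the tree."""
    segments = value.split(sep)
    return indent * (len(segments) - 1) + segments[-1]


facet_values = ["NonFic", "NonFic/Sci", "NonFic/Sci/Phy"]
labels = [display_label(v) for v in facet_values]
for label in labels:
    print(label)
```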


Thanks,
Evan



On Nov 14, 2014 8:06 PM, Oleg Savrasov osavra...@griddynamics.com wrote:

 Hi Rashmi,

 I believe you are looking for PathHierarchyTokenizer,
 see

 https://lucene.apache.org/core/4_0_0/analyzers-common/org/apache/lucene/analysis/path/PathHierarchyTokenizer.html

 Oleg

 2014-11-14 17:53 GMT-05:00 rashmy1 rashmy.appanerava...@siemens.com:

  Hello,
  I'm trying to set up Solr for fetching hierarchical facets.
  Please advise which of the below approaches should be followed for my
  scenario.
  *Scenario:*
  NonFic
    Hist
      HistBook1
      HistBook2
    Sci
      Phy
        Quantum
          Pbook1
          Pbook2
        Thermodynamics
          Pbook3
          Pbook4
      Chem
        Cbook1
      Math
        Mbook1
  Fic
    Mystery
      Mybook1
    Childrens
      Chbook1
      Chbook2
 
  *Sample document*
  doc
  name=Pbook1
  category=NonFic/Sci/Phy/Quantum
  author=ABC
  price=20.00
  doc
 
  *Requirements:*
  -Show drill down facets
  -If user searched for *, the initial set of facets to be shown are
  'NonFic' and 'Fic'
  -If user selects facet 'NonFic', we then show the facets 'Hist' and 'Sci'
  only.
 
  *Option1:*
  /Solr schema:/
  <field indexed="true" multiValued="true" name="category" required="true"
         stored="true" type="string"/>
  /Document supplied for indexing:/
  doc
  name=Pbook1
  category=0/NonFic
  category=1/NonFic/Sci
  category=2/NonFic/Sci/Phy
  category=3/NonFic/Sci/Phy/Quantum
  category=0/Other (a book can belong to multiple categories)
  author=ABC
  price=20.00
  doc
  With Option1, we can do a drill down facet query.
  For example, if we give facet.prefix=NonFic/Sci/, the facet results are:
  NonFic/Sci/Phy
  NonFic/Sci/Chem
  NonFic/Sci/Math
  The only issue is that I have to take care of generating all possible
  path information for 'category'
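For reference, generating the Option1 values on the client is a small amount of code. This Python sketch is only an illustration (the function name is made up) and produces the depth-prefixed entries shown in the Option1 document from a single path.

```python
def depth_prefixed_paths(path, sep="/"):
    """Expand one path into a depth-prefixed value for every ancestor level."""
    segments = path.split(sep)
    return [
        f"{level}{sep}{sep.join(segments[: level + 1])}"
        for level in range(len(segments))
    ]


values = depth_prefixed_paths("NonFic/Sci/Phy/Quantum")
# values == ["0/NonFic", "1/NonFic/Sci", "2/NonFic/Sci/Phy",
#            "3/NonFic/Sci/Phy/Quantum"]
```

Note that when the stored values carry the depth like this, the prefix sent to Solr has to carry it too: the children of NonFic/Sci would be requested with facet.prefix=2/NonFic/Sci/.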
 
  *Option2:*
  /Solr schema:/
  <fieldType class="solr.TextField" name="path">
    <analyzer type="index">
      <tokenizer class="solr.PathHierarchyTokenizerFactory"
                 delimiter="/"/>
    </analyzer>
  </fieldType>
  <field indexed="true" multiValued="true" name="category" required="true"
         stored="true" type="path"/>
  /Document supplied for indexing:/
  doc
  name=Pbook1
  category=NonFic/Sci/Phy/Quantum
  author=ABC
  price=20.00
  doc
  With Option2, we can do a facet query, but it returns all possible
  combinations of paths.
  For example, if we give facet.prefix=Fic, the facet results are:
  Fic (3)
  Fic/Mystery (1)
  Fic/Childrens (2)
 
 
  I'm looking to supply a doc with just a single entry (like
  'category=NonFic/Sci/Phy/Quantum') and be able to do a drill down query.
  Is there some existing Solr tokenizer which takes care of generating all
  possible combinations while indexing, instead of having to generate them
  as part of doc creation?
 
  Thanks
 
 
 
 
 
  --
  View this message in context:
  http://lucene.472066.n3.nabble.com/Hierarchical-faceting-tp4169263.html
  Sent from the Solr - User mailing list archive at Nabble.com.
