Preventing field data from being loaded into page cache

2023-10-20 Thread Justin Borromeo
Is there any way to keep field data files out of the operating system's
page cache? We only use fdt for highlighting and don't need to keep it warm
in memory.  From what I understand, the operating system is in control of
what files get loaded into the page cache. Does Lucene have any mechanisms
to explicitly prevent them from being cached?  Is it even possible with
Java?

Thanks,
Justin Borromeo


Re: When to use StringField and when to use FacetField for categorization?

2023-10-20 Thread Michael Wechner

thanks very much for this additional information, Marc!


-
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org






Re: When to use StringField and when to use FacetField for categorization?

2023-10-20 Thread Marc D'Mello
Just following up on Mike's comment:


> It used to be that the "doc values" based faceting did not support
> arbitrary hierarchy, but I think that was fixed at some point.


Yeah, it was fixed a year or two ago. SortedSetDocValuesFacetField supports
hierarchical faceting; I think you just need to enable it in the
FacetsConfig. One thing to keep in mind: even though SSDV faceting
doesn't require a taxonomy index, it still requires a
SortedSetDocValuesReaderState to be maintained, which can be a little bit
expensive to create but only needs to be done once. This benchmark code
serves as a pretty basic example of SSDV/hierarchical SSDV faceting.
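
As a rough illustration of the setup described above, here is a minimal sketch of hierarchical SSDV faceting. It assumes a Lucene 9.x release with the lucene-core and lucene-facet modules on the classpath; the class name, the in-memory directory, and the two sample documents are made up for the example, and this is not the benchmark code referred to above. The reader state is created once for the opened reader and would normally be kept around and reused across searches on that reader.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import org.apache.lucene.document.Document;
import org.apache.lucene.facet.FacetResult;
import org.apache.lucene.facet.Facets;
import org.apache.lucene.facet.FacetsConfig;
import org.apache.lucene.facet.sortedset.DefaultSortedSetDocValuesReaderState;
import org.apache.lucene.facet.sortedset.SortedSetDocValuesFacetCounts;
import org.apache.lucene.facet.sortedset.SortedSetDocValuesFacetField;
import org.apache.lucene.facet.sortedset.SortedSetDocValuesReaderState;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;

// Hypothetical example class; assumes Lucene 9.x (core + facet modules).
final class SsdvHierarchySketch {

  // Indexes two sample documents and returns the children of "topic" below the given path.
  static FacetResult topicChildren(String... path) {
    FacetsConfig config = new FacetsConfig();
    config.setHierarchical("topic", true); // allow multi-level paths for this dimension

    try (Directory dir = new ByteBuffersDirectory()) {
      try (IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig())) {
        Document doc1 = new Document();
        doc1.add(new SortedSetDocValuesFacetField("topic", "physics", "quantum"));
        writer.addDocument(config.build(doc1)); // no taxonomy writer involved

        Document doc2 = new Document();
        doc2.add(new SortedSetDocValuesFacetField("topic", "physics", "astro"));
        writer.addDocument(config.build(doc2));
      }

      try (DirectoryReader reader = DirectoryReader.open(dir)) {
        // The comparatively expensive step: build once per opened reader, then reuse.
        SortedSetDocValuesReaderState state =
            new DefaultSortedSetDocValuesReaderState(reader, config);
        // This constructor counts over all documents in the reader.
        Facets facets = new SortedSetDocValuesFacetCounts(state);
        return facets.getTopChildren(10, "topic", path);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(topicChildren());          // children directly under "topic"
    System.out.println(topicChildren("physics")); // children under "topic/physics"
  }
}
```

To count only the documents matching a query, the counts would instead be built from a FacetsCollector that was filled during the search.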



Re: When to use StringField and when to use FacetField for categorization?

2023-10-20 Thread Michael Wechner

cool, thank you very much!

Michael






Re: When to use StringField and when to use FacetField for categorization?

2023-10-20 Thread Michael McCandless
You can use either the "doc values" implementation for facets
(SortedSetDocValuesFacetField), or the "taxonomy" implementation
(FacetField, in which case, yes, you need to create a TaxonomyWriter).

It used to be that the "doc values" based faceting did not support
arbitrary hierarchy, but I think that was fixed at some point.

Mike McCandless

http://blog.mikemccandless.com
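
For the "taxonomy" variant mentioned above, the following is a minimal sketch of indexing FacetField values through a taxonomy writer and reading the counts back. It assumes Lucene 9.x with the lucene-core and lucene-facet modules; the class name, the in-memory directories, and the sample documents (taken from the original question) are illustrative only. As far as I know, a dimension that appears more than once per document ("author") has to be marked multi-valued, and a multi-level path ("topic") has to be marked hierarchical in the FacetsConfig.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import org.apache.lucene.document.Document;
import org.apache.lucene.facet.FacetField;
import org.apache.lucene.facet.FacetResult;
import org.apache.lucene.facet.Facets;
import org.apache.lucene.facet.FacetsConfig;
import org.apache.lucene.facet.taxonomy.FastTaxonomyFacetCounts;
import org.apache.lucene.facet.taxonomy.TaxonomyReader;
import org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyReader;
import org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;

// Hypothetical example class; assumes Lucene 9.x (core + facet modules).
final class TaxonomyFacetSketch {

  // Indexes two sample documents and returns the top children of the given dimension.
  static FacetResult topChildren(String dim) {
    FacetsConfig config = new FacetsConfig();
    config.setMultiValued("author", true);  // doc1 carries two authors
    config.setHierarchical("topic", true);  // "physics" -> "quantum" / "astro"

    try (Directory indexDir = new ByteBuffersDirectory();
        Directory taxoDir = new ByteBuffersDirectory()) {
      // The taxonomy lives in its own directory, next to the main index.
      try (IndexWriter indexWriter = new IndexWriter(indexDir, new IndexWriterConfig());
          DirectoryTaxonomyWriter taxoWriter = new DirectoryTaxonomyWriter(taxoDir)) {
        Document doc1 = new Document();
        doc1.add(new FacetField("media-type", "book"));
        doc1.add(new FacetField("topic", "physics", "quantum"));
        doc1.add(new FacetField("author", "Neumann"));
        doc1.add(new FacetField("author", "Wheeler"));
        indexWriter.addDocument(config.build(taxoWriter, doc1));

        Document doc2 = new Document();
        doc2.add(new FacetField("media-type", "magazine"));
        doc2.add(new FacetField("topic", "physics", "astro"));
        indexWriter.addDocument(config.build(taxoWriter, doc2));
      }

      try (DirectoryReader reader = DirectoryReader.open(indexDir);
          TaxonomyReader taxoReader = new DirectoryTaxonomyReader(taxoDir)) {
        // This constructor counts over all non-deleted documents in the reader.
        Facets facets = new FastTaxonomyFacetCounts(
            FacetsConfig.DEFAULT_INDEX_FIELD_NAME, reader, taxoReader, config);
        return facets.getTopChildren(10, dim);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(topChildren("author"));
    System.out.println(topChildren("media-type"));
  }
}
```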




Re: When to use StringField and when to use FacetField for categorization?

2023-10-20 Thread Michael Wechner

Hi Adrien

Thank you very much for your feedback as well!

I just replaced the StringField by KeywordField :-)

Thanks

Michael



Re: When to use StringField and when to use FacetField for categorization?

2023-10-20 Thread Michael Wechner

Hi Mike

Thanks for your feedback!

IIUC in order to have the actual advantages of Facets one has to 
"connect" it with a TaxonomyWriter


FacetsConfig config = new FacetsConfig();
DirectoryTaxonomyWriter taxoWriter = new DirectoryTaxonomyWriter(taxoDir);
indexWriter.addDocument(config.build(taxoWriter, doc));

right?

Thanks

Michael







Re: When to use StringField and when to use FacetField for categorization?

2023-10-20 Thread Adrien Grand
FYI there is also KeywordField, which combines StringField and
SortedSetDocValuesField. It supports filtering, sorting, faceting and
retrieval. It's my go-to field for string values.
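
A minimal sketch of what using KeywordField for the "category" values from the original question might look like. It assumes a recent Lucene 9.x release that ships KeywordField, with lucene-core on the classpath; the class name, the in-memory directory, and the sample documents are only for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.KeywordField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;

// Hypothetical example class; assumes a Lucene 9.x release that includes KeywordField.
final class KeywordFieldSketch {

  // Indexes two sample documents and returns how many carry the given category.
  static int countMatches(String category) {
    try (Directory dir = new ByteBuffersDirectory()) {
      try (IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig())) {
        Document doc1 = new Document();
        // One field type covers the inverted index, doc values and (optionally) stored values.
        doc1.add(new KeywordField("category", "book", Field.Store.YES));
        doc1.add(new KeywordField("category", "quantum_physics", Field.Store.YES));
        writer.addDocument(doc1);

        Document doc2 = new Document();
        doc2.add(new KeywordField("category", "magazine", Field.Store.YES));
        doc2.add(new KeywordField("category", "astro_physics", Field.Store.YES));
        writer.addDocument(doc2);
      }

      try (DirectoryReader reader = DirectoryReader.open(dir)) {
        IndexSearcher searcher = new IndexSearcher(reader);
        // Exact-match filter on the untokenized value.
        Query query = KeywordField.newExactQuery("category", category);
        return searcher.count(query);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(countMatches("book"));
  }
}
```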



Re: When to use StringField and when to use FacetField for categorization?

2023-10-20 Thread Michael McCandless
There are some differences.

StringField is indexed into the inverted index (postings) so you can do
efficient filtering.  You can also store in stored fields to retrieve.

FacetField does everything StringField does (filtering, storing (maybe?)),
but in addition it stores data for faceting.  I.e. you can compute facet
counts or simple aggregations at search time.

FacetField is also hierarchical: you can filter and facet by different
points/levels of your hierarchy.

Mike McCandless

http://blog.mikemccandless.com
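
To make the distinction above more concrete, here is a toy model in plain Java. It does not use Lucene and says nothing about how Lucene stores its data; the class and method names are invented for this sketch. It only mirrors the two ideas: a term-to-documents map for filtering, and per-document hierarchical paths that can be counted at any level.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

// Toy model only: plain collections standing in for the concepts, not Lucene's implementation.
final class FacetConceptSketch {
  // "Postings": term -> ids of the documents that contain it.
  private final Map<String, SortedSet<Integer>> postings = new HashMap<>();
  // Facet data: document id -> hierarchical paths such as [topic, physics, quantum].
  private final Map<Integer, List<List<String>>> facetPaths = new HashMap<>();

  void addTerm(int docId, String term) {
    postings.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
  }

  void addFacet(int docId, String... path) {
    facetPaths.computeIfAbsent(docId, d -> new ArrayList<>()).add(Arrays.asList(path));
  }

  // Filtering: a single lookup by term.
  SortedSet<Integer> filter(String term) {
    return postings.getOrDefault(term, new TreeSet<>());
  }

  // Faceting: for the given documents, count the direct children of a path prefix.
  Map<String, Integer> countChildren(Collection<Integer> docIds, String... prefix) {
    List<String> wanted = Arrays.asList(prefix);
    Map<String, Integer> counts = new TreeMap<>();
    for (int docId : docIds) {
      Set<String> seen = new HashSet<>(); // count each child at most once per document
      for (List<String> path : facetPaths.getOrDefault(docId, List.of())) {
        boolean below = path.size() > wanted.size()
            && path.subList(0, wanted.size()).equals(wanted);
        if (below && seen.add(path.get(wanted.size()))) {
          counts.merge(path.get(wanted.size()), 1, Integer::sum);
        }
      }
    }
    return counts;
  }

  // Sample data modeled on the documents from the original question.
  static FacetConceptSketch demo() {
    FacetConceptSketch index = new FacetConceptSketch();
    index.addTerm(1, "book");
    index.addFacet(1, "topic", "physics", "quantum");
    index.addFacet(1, "author", "Neumann");
    index.addFacet(1, "author", "Wheeler");
    index.addTerm(2, "magazine");
    index.addFacet(2, "topic", "physics", "astro");
    return index;
  }

  public static void main(String[] args) {
    FacetConceptSketch index = demo();
    System.out.println(index.filter("book"));
    System.out.println(index.countChildren(List.of(1, 2), "topic"));
    System.out.println(index.countChildren(List.of(1, 2), "topic", "physics"));
  }
}
```

Counting under "topic" groups both documents under "physics", while counting under "topic", "physics" splits them into "astro" and "quantum"; that is the "different points/levels of your hierarchy" part.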




When to use StringField and when to use FacetField for categorization?

2023-10-20 Thread Michael Wechner

Hi

I have found the following simple Facet Example

https://github.com/apache/lucene/blob/main/lucene/demo/src/java/org/apache/lucene/demo/facet/SimpleFacetsExample.java

whereas for a simple categorization of documents I currently use 
StringField, e.g.


doc1.add(new StringField("category", "book"));
doc1.add(new StringField("category", "quantum_physics"));
doc1.add(new StringField("category", "Neumann"));
doc1.add(new StringField("category", "Wheeler"));

doc2.add(new StringField("category", "magazine"));
doc2.add(new StringField("category", "astro_physics"));

which works well, but would it be better to use Facets for this, e.g.

doc1.add(new FacetField("media-type", "book"));
doc1.add(new FacetField("topic", "physics", "quantum"));
doc1.add(new FacetField("author", "Neumann"));
doc1.add(new FacetField("author", "Wheeler"));

doc2.add(new FacetField("media-type", "magazine"));
doc2.add(new FacetField("topic", "physics", "astro"));

?

IIUC the StringField approach is more general, whereas the FacetField
approach allows a more specific categorization / search.

Or do I misunderstand this?

Thanks

Michael


