[
https://issues.apache.org/jira/browse/SOLR-16235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Joel Bernstein updated SOLR-16235:
----------------------------------
Description:
The latest versions of Streaming Expressions support the drill function which
is designed for high cardinality aggregations
(https://solr.apache.org/guide/8_6/stream-source-reference.html#drill). Drill
allows users to push down a Streaming Expression into the export handler itself
and emit aggregated tuples over the wire. Because drill still takes advantage
of the sort order of the export handler, it supports unlimited cardinality. Due
to the massive performance improvements in the export handler in Solr 9, drill
is almost as fast as facets. Drill is not a replacement for facet mode, though,
which is still faster in low- to medium-cardinality situations.
This improvement should be rather easy to implement but there are some
questions about the design. One thought I had was to add a new mode called
*drill* to the existing *map_reduce* and *facet* modes. This would preserve
the existing map_reduce execution plan. The other approach is simply to always
use drill in *map_reduce* mode aggregations.
was:
The latest versions of Streaming Expressions support the drill function which
is designed for high cardinality aggregations
(https://solr.apache.org/guide/8_6/stream-source-reference.html#drill). Drill
allows users to push down a Streaming Expression into the export handler itself
and emit aggregated tuples over the wire. Because drill still takes advantage
of the sort order of the export handler it supports unlimited cardinality.
This improvement should be rather easy to implement but there are some
questions about the design. One thought I had was to add a new mode called
*drill* to the existing *map_reduce* and *facet* modes. This would preserve
the existing map_reduce execution plan. The other approach is simply to always
use drill in *map_reduce* mode aggregations.
> Allow Solr SQL to use the drill Streaming Expression for high cardinality
> aggregations
> --------------------------------------------------------------------------------------
>
> Key: SOLR-16235
> URL: https://issues.apache.org/jira/browse/SOLR-16235
> Project: Solr
> Issue Type: Improvement
> Security Level: Public(Default Security Level. Issues are Public)
> Components: Parallel SQL
> Reporter: Joel Bernstein
> Assignee: Joel Bernstein
> Priority: Major
> Labels: RobustSQL
>
> The latest versions of Streaming Expressions support the drill function which
> is designed for high cardinality aggregations
> (https://solr.apache.org/guide/8_6/stream-source-reference.html#drill). Drill
> allows users to push down a Streaming Expression into the export handler
> itself and emit aggregated tuples over the wire. Because drill still takes
> advantage of the sort order of the export handler, it supports unlimited
> cardinality. Due to the massive performance improvements in the export handler
> in Solr 9, drill is almost as fast as facets. Drill is not a replacement for
> facet mode, though, which is still faster in low- to medium-cardinality
> situations.
> This improvement should be rather easy to implement but there are some
> questions about the design. One thought I had was to add a new mode called
> *drill* to the existing *map_reduce* and *facet* modes. This would preserve
> the existing map_reduce execution plan. The other approach is simply to
> always use drill in *map_reduce* mode aggregations.
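
For readers unfamiliar with why sort order matters here, the following is a
minimal Python sketch of the general idea, not Solr's actual implementation.
It assumes each shard emits tuples already sorted by the grouping field, so a
per-shard rollup only ever holds one group in memory, and the partial counts
can then be merge-sorted and summed. The function names (rollup_sorted,
merge_partials) and the "author" field are hypothetical and chosen only for
illustration.

```python
import heapq
from itertools import groupby
from operator import itemgetter


def rollup_sorted(tuples, key):
    """Count tuples per key value from a stream already sorted by `key`.

    Because the input is sorted, each group is contiguous, so only the
    current group is in flight no matter how many distinct keys exist.
    """
    for value, group in groupby(tuples, key=itemgetter(key)):
        yield {key: value, "count": sum(1 for _ in group)}


def merge_partials(shard_streams, key):
    """Merge per-shard partial counts (each sorted by `key`) into totals.

    heapq.merge keeps the combined stream sorted, so the final sum is
    again a streaming pass over contiguous groups.
    """
    merged = heapq.merge(*shard_streams, key=itemgetter(key))
    for value, group in groupby(merged, key=itemgetter(key)):
        yield {key: value, "count": sum(t["count"] for t in group)}


# Hypothetical data: two shards, each already sorted by "author".
shard1 = [{"author": "a"}, {"author": "a"}, {"author": "b"}]
shard2 = [{"author": "a"}, {"author": "c"}, {"author": "c"}]

partials = [rollup_sorted(shard, "author") for shard in (shard1, shard2)]
totals = list(merge_partials(partials, "author"))
```

In this sketch the first function plays the role of the aggregation pushed
down to each shard, and the second plays the role of the final aggregation
over the much smaller stream of partial results.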