[ 
https://issues.apache.org/jira/browse/SOLR-8590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Hatcher updated SOLR-8590:
-------------------------------
    Attachment: SOLR-8590.patch

This patch fixes the email_ss and url_ss field names, hardens the update script 
so "content" isn't required, sets a fallback language, and increases the 
language detection threshold.
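
As a rough illustration of the hardening described above (this is not the code 
in the patch, and the actual update script may differ), the guard amounts to 
checking for "content" before extracting anything. The sketch uses Python with 
a plain dict standing in for a Solr document; the function name 
add_extracted_fields and both regular expressions are assumptions made for 
this example only.

```python
import re

# Deliberately simple patterns for illustration; they are assumptions,
# not the expressions used by example/files.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")
URL_RE = re.compile(r"https?://[^\s\"'<>]+")


def add_extracted_fields(doc):
    """Add email_ss and url_ss to doc when it has a "content" field."""
    content = doc.get("content")
    if not content:
        # No "content" (e.g. a basic CSV row): skip extraction, don't fail.
        return doc
    emails = sorted(set(EMAIL_RE.findall(content)))
    urls = sorted(set(URL_RE.findall(content)))
    if emails:
        doc["email_ss"] = emails
    if urls:
        doc["url_ss"] = urls
    return doc
```

A document without "content" passes through unchanged, so indexing plain CSV 
rows would no longer fail at this step.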

> example/files improvements
> --------------------------
>
>                 Key: SOLR-8590
>                 URL: https://issues.apache.org/jira/browse/SOLR-8590
>             Project: Solr
>          Issue Type: Bug
>          Components: examples
>            Reporter: Erik Hatcher
>            Assignee: Erik Hatcher
>            Priority: Minor
>             Fix For: 6.0
>
>         Attachments: SOLR-8590.patch
>
>
> There are several example/files improvements/fixes that are warranted:
> * Fix e-mail and URL field names (currently {{<email>_ss}} and {{<url>_ss}}, 
> with angle brackets in the field names); also display these fields in 
> /browse results rendering
> * Improve quality of extracted phrases
> * Extract, facet, and display acronyms
> * Add sorting controls, possibly all or some of these: last modified date, 
> created date, relevancy, and title
> * Add grouping by doc_type perhaps
> * Fix debug mode: it currently does not update the parsed query debug output 
> (this is probably a bug in data driven /browse as well)
> * Harden update-script: it currently errors if documents do not have a 
> "content" field (e.g. when indexing basic CSV), but should instead skip 
> extraction of e-mail addresses and URLs when there is no "content" field.  
> Documents without "content" are not quite the example/files use case, but 
> there is no reason for the update script to error on them.
> * Filter out bogus e-mail addresses.  I'm seeing {{email_ss = 
> "?@[^],\,/^@[$_a-z]"}} for some documents (using Solr docs/ directory as the 
> dataset)
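
One possible way to filter out values like the bogus address quoted above is 
to accept a candidate only if the whole string looks like an address. This is 
a sketch under assumptions, not the approach taken in the patch: the pattern 
PLAUSIBLE_EMAIL and the helper filter_emails are hypothetical names, and the 
rule is intentionally stricter than what mail standards allow.

```python
import re

# Assumed plausibility rule for this example: a local part, one "@",
# and a dotted domain ending in an alphabetic top-level label.
PLAUSIBLE_EMAIL = re.compile(
    r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)*\.[A-Za-z]{2,}"
)


def filter_emails(candidates):
    """Keep only candidates that match the plausibility rule in full."""
    return [c for c in candidates if PLAUSIBLE_EMAIL.fullmatch(c)]
```

Because the match must cover the entire candidate, regex fragments picked up 
from documentation text would be dropped rather than indexed into email_ss.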



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

