[jira] [Commented] (SOLR-752) Allow better Field Compression options
[ https://issues.apache.org/jira/browse/SOLR-752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13494042#comment-13494042 ]

Pieter commented on SOLR-752:
-----------------------------

Doesn't the new Lucene 4.1 field compression (LUCENE-4226, if I am right) tackle this?

Allow better Field Compression options
--------------------------------------

Key: SOLR-752
URL: https://issues.apache.org/jira/browse/SOLR-752
Project: Solr
Issue Type: Improvement
Reporter: Grant Ingersoll
Priority: Minor
Attachments: compressed_field.patch, compressedtextfield.patch

See http://lucene.markmail.org/message/sd4mgwud6caevb35?q=compression

It would be good if Solr handled field compression outside of Lucene's Field.COMPRESS capabilities, since those capabilities are less than ideal when it comes to control over compression.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-752) Allow better Field Compression options
[ https://issues.apache.org/jira/browse/SOLR-752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13494117#comment-13494117 ]

David Smiley commented on SOLR-752:
-----------------------------------

LUCENE-4226 basically does this, but you can't configure codecs; you pick a codec in its default mode. The Compressing codec defaults to fast and yields ~50% savings based on Adrien's tests of a small to medium sized index:
https://issues.apache.org/jira/browse/LUCENE-4226?focusedCommentId=13451708&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13451708

But what I'd like to see is the ability to compress a large text field (alone), for the purposes of highlighting, and with much more than 50% compression. It might not be able to handle that many concurrent requests while meeting response time SLAs, but some search apps aren't under high load.
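To make the speed versus ratio trade-off above concrete, here is a minimal sketch that compresses a single field value with the JDK's java.util.zip.Deflater at two different levels and restores it. It is not the Lucene codec and not either of the attached patches; the class and method names (FieldCompressionSketch, compress, decompress) are assumptions made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper, not part of Solr or Lucene.
class FieldCompressionSketch {

    // Compress one field value at the given Deflater level
    // (Deflater.BEST_SPEED = 1 ... Deflater.BEST_COMPRESSION = 9).
    static byte[] compress(String text, int level) {
        Deflater deflater = new Deflater(level);
        deflater.setInput(text.getBytes(StandardCharsets.UTF_8));
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        while (!deflater.finished()) {
            int n = deflater.deflate(buf);
            out.write(buf, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    // Restore the original text, e.g. just before handing it to a highlighter.
    static String decompress(byte[] packed) {
        Inflater inflater = new Inflater();
        inflater.setInput(packed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                // No progress and not finished means the input was truncated.
                if (n == 0 && !inflater.finished()
                        && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalArgumentException("truncated compressed data");
                }
                out.write(buf, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("invalid compressed data", e);
        } finally {
            inflater.end();
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 2000; i++) {
            sb.append("a large text field kept only for highlighting. ");
        }
        String text = sb.toString();
        byte[] fast = compress(text, Deflater.BEST_SPEED);
        byte[] small = compress(text, Deflater.BEST_COMPRESSION);
        System.out.println("raw bytes:        " + text.getBytes(StandardCharsets.UTF_8).length);
        System.out.println("BEST_SPEED:       " + fast.length);
        System.out.println("BEST_COMPRESSION: " + small.length);
        System.out.println("round trip ok:    " + text.equals(decompress(small)));
    }
}
```

The sample text here is highly repetitive, so the sizes it prints say nothing about real documents; the point is only that the level is a per-field choice the application could make.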
[jira] [Commented] (SOLR-752) Allow better Field Compression options
[ https://issues.apache.org/jira/browse/SOLR-752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13094219#comment-13094219 ]

Kim Taylor commented on SOLR-752:
---------------------------------

I've had a look into this. Simon is right: DefaultSolrHighlighter uses doc.getValues(fieldName) to retrieve the field, and lucene.document.Document.getValues() calls stringValue() on the matching fields.

The problem is that when FieldsWriter/FieldsReader read fields back from segments, the supplied CompressedField gets converted into a plain Field, which does not know how to interpret fieldsData.

I've added another patch that alters DefaultSolrHighlighter to use the schema FieldType (in this case CompressedField) to properly interpret fieldsData.
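The failure mode and the fix described above can be sketched without any Solr or Lucene classes. Everything below is a hypothetical stand-in (StoredField, SchemaFieldType, CompressedTextType, textForHighlighting); it is not the code in the attached patch, and GZIP is used only because it ships with the JDK.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical model of the problem; none of these types are Solr or Lucene API.
class HighlighterDecodeSketch {

    // Stand-in for a field as read back from a segment: it carries raw data
    // but no longer knows which schema type produced it.
    static final class StoredField {
        final String name;
        final Object fieldsData; // either a String or a byte[]

        StoredField(String name, Object fieldsData) {
            this.name = name;
            this.fieldsData = fieldsData;
        }

        // A generic field can only hand back data that is already a String.
        String stringValue() {
            return fieldsData instanceof String ? (String) fieldsData : null;
        }
    }

    // Stand-in for a schema field type that knows its own stored encoding.
    interface SchemaFieldType {
        StoredField createField(String name, String text);
        String readableValue(StoredField field);
    }

    // Stores text as GZIP-compressed UTF-8 bytes.
    static final class CompressedTextType implements SchemaFieldType {
        public StoredField createField(String name, String text) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
                gz.write(text.getBytes(StandardCharsets.UTF_8));
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            return new StoredField(name, out.toByteArray());
        }

        public String readableValue(StoredField field) {
            byte[] data = (byte[]) field.fieldsData;
            try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(data))) {
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buf = new byte[4096];
                int n;
                while ((n = in.read(buf)) != -1) {
                    out.write(buf, 0, n);
                }
                return new String(out.toByteArray(), StandardCharsets.UTF_8);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    // The patched behaviour in miniature: ask the schema type to decode the
    // stored data instead of trusting the generic stringValue().
    static String textForHighlighting(StoredField field, SchemaFieldType type) {
        return type.readableValue(field);
    }

    public static void main(String[] args) {
        SchemaFieldType type = new CompressedTextType();
        StoredField field = type.createField("body", "text the highlighter needs");
        System.out.println("generic stringValue(): " + field.stringValue());
        System.out.println("via schema type:       " + textForHighlighting(field, type));
    }
}
```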
[jira] Commented: (SOLR-752) Allow better Field Compression options
[ https://issues.apache.org/jira/browse/SOLR-752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12891284#action_12891284 ]

David Smiley commented on SOLR-752:
-----------------------------------

I spent some time today attempting to implement this with my own Solr FieldType that extends TextField. As I tried to implement it, I realized that I couldn't really do it.

FieldType has a method createField(...) that I need to implement in order to set binary data (i.e. byte[]) on a Field. This method demands that I return an org.apache.lucene.document.Field, which is final. If I create the field with binary data, by default it's not indexed or tokenized. I can get those booleans to flip by simply invoking f.setTokenStream(null). However, I can't set omitNorms() to false, nor can I set the booleans for the term vector fields.

There may be other issues, but at this point I gave up to work on other, more important priorities of mine.
[jira] Commented: (SOLR-752) Allow better Field Compression options
[ https://issues.apache.org/jira/browse/SOLR-752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12891305#action_12891305 ]

David Smiley commented on SOLR-752:
-----------------------------------

I already looked at BinaryField and TrieField for inspiration. BinaryField assumes you're not going to index the data, and TrieField doesn't set a binary data value on the Field.

Yes, I think the next step is to make createField() return Fieldable. But I'm not a committer...

Instead, or in addition, I have to wonder: why not modify Lucene's Field class to allow me to set the Index, Store, and TermVector enums AND specify binary data in a suitable constructor? Arguably an existing constructor taking String could be hijacked to take Object and then do the right thing. That would be a small change, whereas implementing another subclass of AbstractField is more complex and would likely reproduce much of what's in Field already.
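The constructor change proposed above might look roughly like the following. This is a standalone sketch, not Lucene's Field class: SketchField and its nested enums are hypothetical names that only mirror the shape of the idea, and the question of how indexed terms would be derived from a binary value is deliberately left out.

```java
// Hypothetical illustration only; this is not org.apache.lucene.document.Field.
class FieldConstructorSketch {

    enum Store { YES, NO }
    enum Index { NO, ANALYZED, NOT_ANALYZED }
    enum TermVector { NO, YES, WITH_POSITIONS_OFFSETS }

    static final class SketchField {
        final String name;
        final Object fieldsData; // String or byte[]
        final boolean binary;
        final Store store;
        final Index index;
        final TermVector termVector;

        // One constructor takes Object and "does the right thing": a String is
        // kept as text, a byte[] is kept as binary, and in both cases the
        // caller still chooses the store, index, and term vector options.
        SketchField(String name, Object value, Store store, Index index, TermVector termVector) {
            if (value instanceof String) {
                this.binary = false;
            } else if (value instanceof byte[]) {
                this.binary = true;
            } else {
                throw new IllegalArgumentException("value must be a String or byte[]");
            }
            this.name = name;
            this.fieldsData = value;
            this.store = store;
            this.index = index;
            this.termVector = termVector;
        }

        boolean isIndexed() {
            return index != Index.NO;
        }

        boolean isTermVectorStored() {
            return termVector != TermVector.NO;
        }
    }

    public static void main(String[] args) {
        byte[] compressed = {0x1f, 0x2e, 0x3d}; // placeholder bytes, not real compressed data
        SketchField f = new SketchField("body", compressed,
                Store.YES, Index.ANALYZED, TermVector.WITH_POSITIONS_OFFSETS);
        System.out.println("binary=" + f.binary
                + " indexed=" + f.isIndexed()
                + " termVectors=" + f.isTermVectorStored());
    }
}
```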