[jira] Created: (SOLR-1583) Create DataSources that return InputStream
Create DataSources that return InputStream
Key: SOLR-1583
URL: https://issues.apache.org/jira/browse/SOLR-1583
Project: Solr
Issue Type: New Feature
Components: contrib - DataImportHandler
Reporter: Noble Paul
Priority: Minor

Tika integration means the source has to be binary; that is, the DataSource must be of type DataSource<InputStream>. Every DataSource<Reader> should have a binary counterpart:
* BinURLDataSource<InputStream>
* BinContentStreamDataSource<InputStream>
* BinFileDataSource<InputStream>

-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (SOLR-1584) setIncludeScore is added to the FL field instead of being concated
setIncludeScore is added to the FL field instead of being concated
Key: SOLR-1584
URL: https://issues.apache.org/jira/browse/SOLR-1584
Project: Solr
Issue Type: Bug
Components: clients - java
Affects Versions: 1.4
Reporter: Asaf Ary
Priority: Minor

The current implementation of setIncludeScore(boolean) *adds* the value score to the FL parameter. This causes a problem when setFields is followed by setIncludeScore. If I do this:

setFields("*");
setIncludeScore(true);

I would expect the outcome to be fl=*,score. Instead the outcome is: fl=* fl=score, which fails to use the score field, as FL is not a multi-valued field. The current implementation in the SolrJ SolrQuery object is:

add("fl", "score")

Instead it should be:

set("fl", get("fl") + ",score")

Obviously not as simplistic as that, but you catch my drift...
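The concatenation suggested in SOLR-1584 can be sketched without touching SolrJ at all. The class below is a hypothetical helper (FieldListSketch and withScore are illustrative names, not part of SolrQuery); it only shows one way to compute a single comma-separated fl value, assuming an existing score entry should be dropped before it is re-added.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, NOT part of SolrJ: computes the single fl value
// a client could pass to set("fl", ...) instead of calling add("fl", "score").
final class FieldListSketch {

    // currentFl is the existing comma-separated fl value (may be null).
    static String withScore(String currentFl, boolean includeScore) {
        List<String> fields = new ArrayList<>();
        if (currentFl != null) {
            for (String f : currentFl.split(",")) {
                String t = f.trim();
                // Drop empty entries and any existing "score" so it is never duplicated.
                if (!t.isEmpty() && !t.equals("score")) {
                    fields.add(t);
                }
            }
        }
        if (includeScore) {
            fields.add("score");
        }
        return String.join(",", fields);
    }
}
```

With this sketch, withScore("*", true) yields a single value rather than two separate fl parameters, and withScore("*,score", false) removes score again.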
[jira] Assigned: (SOLR-1583) Create DataSources that return InputStream
[ https://issues.apache.org/jira/browse/SOLR-1583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Noble Paul reassigned SOLR-1583: Assignee: Noble Paul
[jira] Updated: (SOLR-1583) Create DataSources that return InputStream
[ https://issues.apache.org/jira/browse/SOLR-1583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Noble Paul updated SOLR-1583: Attachment: SOLR-1583.patch
[jira] Updated: (SOLR-1131) Allow a single field type to index multiple fields
[ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll updated SOLR-1131: Summary: Allow a single field type to index multiple fields (was: Allow a single field to index multiple fields)

Allow a single field type to index multiple fields
Key: SOLR-1131
URL: https://issues.apache.org/jira/browse/SOLR-1131
Project: Solr
Issue Type: New Feature
Components: Analysis
Reporter: Ryan McKinley
Assignee: Grant Ingersoll
Attachments: SOLR-1131-IndexMultipleFields.patch, SOLR-1131.patch

In a few special cases, it makes sense for a single field (the concept) to be indexed as a set of Fields (lucene Field). Consider SOLR-773. The concept "point" may be best indexed in a variety of ways:
* geohash (single lucene field)
* lat field, lon field (two double fields)
* cartesian tiers (a series of fields with tokens to say if it exists within that region)
[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields
[ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12780950#action_12780950 ] Grant Ingersoll commented on SOLR-1131:

bq. Is this a good idea?
Not sure yet.

bq. Why don't we add a new interface MutlValuedFieldType which extends FieldType for this
Aren't we just substituting a very simple construction for an instanceof check? I was possibly thinking of a couple of other options, too:
1. Add a boolean on FT for isMultiField which returns false by default, then we could check that.
2. Add a threadlocal that stores a preconstructed array of size one which could then simply be set for the single field case, which is the most common case.
My gut, however, says the object is very short lived and is likely to be of negligible cost.
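The first option in the comment above (a boolean isMultiField check on the field type) might look roughly like the following. This is only a sketch under stated assumptions: SketchFieldType, SketchPairType and SketchDocumentBuilder are made-up stand-ins, strings stand in for Lucene Field objects, and the _0/_1 sub-field names are invented for illustration. None of it is taken from the attached patches.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a Solr FieldType; a String stands in for a Lucene Field.
class SketchFieldType {
    // Returns false by default, as proposed in the comment.
    boolean isMultiField() { return false; }

    String createField(String name, String value) { return name + "=" + value; }

    // Default multi-field view: wrap the single field in an array.
    String[] createFields(String name, String value) {
        return new String[] { createField(name, value) };
    }
}

// Hypothetical type that indexes one comma-separated value as two fields.
class SketchPairType extends SketchFieldType {
    @Override boolean isMultiField() { return true; }

    @Override String[] createFields(String name, String value) {
        String[] parts = value.split(",");
        return new String[] { name + "_0=" + parts[0].trim(), name + "_1=" + parts[1].trim() };
    }
}

final class SketchDocumentBuilder {
    // The single-field path never allocates an array, which is the
    // GC concern raised for large ingestions.
    static List<String> toLuceneFields(SketchFieldType type, String name, String value) {
        List<String> out = new ArrayList<>();
        if (type.isMultiField()) {
            for (String f : type.createFields(name, value)) {
                out.add(f);
            }
        } else {
            out.add(type.createField(name, value));
        }
        return out;
    }
}
```

The trade-off is the one debated in the thread: the flag costs one virtual call per field, while always calling createFields costs one short-lived array per field.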
[jira] Created: (SOLR-1585) Refactor shards handling out of QueryComponent and into ShardsComponent
Refactor shards handling out of QueryComponent and into ShardsComponent
Key: SOLR-1585
URL: https://issues.apache.org/jira/browse/SOLR-1585
Project: Solr
Issue Type: Improvement
Reporter: Grant Ingersoll
Assignee: Grant Ingersoll
Fix For: 1.5

Per the TODOs in QueryComponent, create a ShardsComponent that handles setting up the shards. Additionally, make it so that it can handle smaller parameters, too. For instance, it is likely the case that in most setups only the IP address is changed, so we could have intelligent defaults which will make for shorter query strings.
[jira] Created: (SOLR-1586) Create Spatial Point FieldTypes
Create Spatial Point FieldTypes
Key: SOLR-1586
URL: https://issues.apache.org/jira/browse/SOLR-1586
Project: Solr
Issue Type: Improvement
Reporter: Grant Ingersoll
Assignee: Grant Ingersoll
Priority: Minor
Fix For: 1.5

Per SOLR-773, create field types that hide the details of creating tiers, geohash and lat/lon fields. Fields should take in lat/lon points in a single form, as in:

<field name="foo">lat lon</field>
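As a rough illustration of the single "lat lon" form requested in SOLR-1586, a field type would first have to split and validate the value. The sketch below assumes whitespace as the separator and the usual degree ranges; LatLonSketch and parse are hypothetical names, and the issue itself does not specify any of this.

```java
// Hypothetical parser for the "lat lon" form; not taken from any Solr patch.
final class LatLonSketch {

    // Returns {lat, lon}; assumes the two values are separated by whitespace.
    static double[] parse(String value) {
        String[] parts = value.trim().split("\\s+");
        if (parts.length != 2) {
            throw new IllegalArgumentException("expected 'lat lon' but got: " + value);
        }
        double lat = Double.parseDouble(parts[0]);
        double lon = Double.parseDouble(parts[1]);
        // Assumed ranges: latitude in [-90, 90], longitude in [-180, 180].
        if (lat < -90 || lat > 90 || lon < -180 || lon > 180) {
            throw new IllegalArgumentException("point out of range: " + value);
        }
        return new double[] { lat, lon };
    }
}
```

A real field type would then turn the parsed pair into tier, geohash or lat/lon fields behind the scenes, which is the detail the issue asks to hide from users.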
[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields
[ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12780953#action_12780953 ] Noble Paul commented on SOLR-1131:

bq. add a boolean on FT for isMultiField which returns false by default, then we could check that
Not bad.

bq. My gut, however, says the object is very short lived and is likely to be of negligible cost.
But for a huge ingestion it just means several million objects created and that much extra GC.
[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields
[ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12780954#action_12780954 ] Grant Ingersoll commented on SOLR-1131: I'm also looking for ideas on how to handle the naming of the fields that are produced by this. I think a FieldType that produces multiple fields should hide the logistics of the naming, which this patch doesn't even begin to scratch the surface of and also on the search side, how does one search against just one of the fields? Would appreciated thoughts on that.
[jira] Issue Comment Edited: (SOLR-1131) Allow a single field type to index multiple fields
[ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12780954#action_12780954 ] Grant Ingersoll edited comment on SOLR-1131 at 11/21/09 12:11 PM: I'm also looking for ideas on how to handle the naming of the fields that are produced by this. I think a FieldType that produces multiple fields should hide the logistics of the naming, which this patch doesn't even begin to scratch the surface of and also on the search side, how does one search against just one of the fields? Would appreciate thoughts on that. was (Author: gsingers): I'm also looking for ideas on how to handle the naming of the fields that are produced by this. I think a FieldType that produces multiple fields should hide the logistics of the naming, which this patch doesn't even begin to scratch the surface of and also on the search side, how does one search against just one of the fields? Would appreciated thoughts on that.
[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields
[ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12780955#action_12780955 ] Chris Male commented on SOLR-1131: My initial feeling is, is searching against just one field something this functionality needs to concern itself with? If someone creates a field of type Point for example, which behind the scenes is indexed as 2 fields, from a Solr schema.xml perspective it is just 1 field, and so it should be the same at the querying level. We are trying to encapsulate the fact that the FieldType results in multiple fields. This then frees us up to choose a naming convention that is easy for us to implement, because we don't have to concern users with the convention. If someone does want to be able to search against just one field, such as maybe being able to find documents at a certain x coordinate, rather than an x,y Point, then I think we can simply recommend they index that data in a separate field.
[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields
[ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12780956#action_12780956 ] Grant Ingersoll commented on SOLR-1131: I definitely agree, Chris, the interesting part is how that manifests itself in terms of implementation, which is where I am digging in at the moment. It means the Query parsers need to handle it as well as the ResponseWriters, etc.
[jira] Commented: (SOLR-1131) Allow a single field type to index multiple fields
[ https://issues.apache.org/jira/browse/SOLR-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12780961#action_12780961 ] Chris Male commented on SOLR-1131: Those are definitely big problems. The ResponseWriter problem could be simplified if they used SolrDocuments rather than retrieving raw Lucene Documents. When constructing the SolrDocuments, which is done in cooperation with an IndexSchema instance, we have the information needed to bring the multiple fields together as one. I'm not sure of the performance impact of doing this, but it seems like having the ResponseWriters retrieve the data in a single consistent fashion is a good thing in the long run anyway.
[jira] Created: (SOLR-1587) Propagating fl=*,score to shards
Propagating fl=*,score to shards
Key: SOLR-1587
URL: https://issues.apache.org/jira/browse/SOLR-1587
Project: Solr
Issue Type: Bug
Components: search
Affects Versions: 1.4
Environment: any http solr server
Reporter: Asaf Ary

When doing an HTTP request to a Solr server using the shards parameter (_shards_), the behavior of the response varies. The following requests cause the entire document (all fields) to return in the response:
{quote}
http://localhost:8180/solr/cpaCore/select/?q=*:*
http://localhost:8180/solr/cpaCore/select/?q=*:*&fl=score
http://localhost:8180/solr/cpaCore/select/?q=*:*&shards=shardLocation/solr/cpaCore
{quote}
The following request causes only the fields id and score to return in the response:
{quote}
http://localhost:8180/solr/cpaCore/select/?q=*:*&fl=score&shards=shardLocation/solr/cpaCore
{quote}
I don't know if this is by design, but it does make for inconsistent behavior, as shard requests behave differently than regular requests.
[jira] Commented: (SOLR-1571) unicode collation support
[ https://issues.apache.org/jira/browse/SOLR-1571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12781053#action_12781053 ] Robert Muir commented on SOLR-1571:

Hi, i wonder if anyone has any comments on this. I know this is an invisible/convert JIRA issue right now :) especially I am curious if the approach is sound, particularly regarding using the ICUCollationFilter instead. In my opinion, this should be a separate integration, even though it will index at a significantly faster speed with much smaller keys. The reason is that it is not compat with the JDK collation keys, and has different properties, such as the fact Collator is thread-safe in the JDK, but not thread-safe in ICU. Because of this, I decided to stick with the JDK impl initially.

unicode collation support
Key: SOLR-1571
URL: https://issues.apache.org/jira/browse/SOLR-1571
Project: Solr
Issue Type: New Feature
Components: Analysis
Reporter: Robert Muir
Priority: Minor
Attachments: SOLR-1571.patch

This patch adds support for unicode collation (searching and sorting). Unicode collation is helpful in a search engine: for many languages you want things to match or sort differently. You might even want to use copyfield and support different sort orders/matching schemes if you need to support multiple languages.

This is simply a factory for lucene's CollationKeyFilter, which indexes binary collation keys in a special format that preserves binary sort order. I've added support for creating a Collator in two ways:
* system collator from a Locale spec (language + country + variant)
* tailored collator from custom rules in a text file

There is no option to use the default locale of the jvm (I consider this a bit dangerous); in this patch, it is mandatory to define the locale explicitly for a system collator. The required lucene-collation-2.9.1.jar is only 12KB.
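The two ways of creating a Collator listed in the SOLR-1571 description map onto plain JDK calls. The sketch below uses only java.text and java.util; CollatorSketch and its method names are hypothetical, the rules are passed as a string rather than read from a file, and how the patch's factory actually wires this into CollationKeyFilter is not shown here.

```java
import java.text.Collator;
import java.text.ParseException;
import java.text.RuleBasedCollator;
import java.util.Locale;

// Hypothetical helper; illustrates the JDK side only, not the Solr factory.
final class CollatorSketch {

    // System collator from an explicit Locale spec (language + country + variant).
    // The JVM default locale is deliberately never consulted.
    static Collator systemCollator(String language, String country, String variant) {
        return Collator.getInstance(new Locale(language, country, variant));
    }

    // Tailored collator from custom rules (in the patch these come from a text file).
    static Collator tailoredCollator(String rules) {
        try {
            return new RuleBasedCollator(rules);
        } catch (ParseException e) {
            throw new IllegalArgumentException("invalid collation rules", e);
        }
    }

    // Binary collation key for a term; keys compare in the collator's order.
    static byte[] sortKey(Collator collator, String text) {
        return collator.getCollationKey(text).toByteArray();
    }
}
```

For example, tailoredCollator("< c < b < a") orders "c" before "a", which a system collator for an English locale would not.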
[jira] Issue Comment Edited: (SOLR-1571) unicode collation support
[ https://issues.apache.org/jira/browse/SOLR-1571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12781053#action_12781053 ] Robert Muir edited comment on SOLR-1571 at 11/21/09 9:13 PM: Hi, i wonder if anyone has any comments on this. I know this is an invisible/covert JIRA issue right now :) especially I am curious if the approach is sound, particularly regarding using the ICUCollationFilter instead. In my opinion, this should be a separate integration, even though it will index at a significantly faster speed with much smaller keys. The reason is that it is not compat with the JDK collation keys, and has different properties, such as the fact Collator is thread-safe in the JDK, but not thread-safe in ICU. Because of this, I decided to stick with the JDK impl initially. was (Author: rcmuir): Hi, i wonder if anyone has any comments on this. I know this is an invisible/convert JIRA issue right now :) especially I am curious if the approach is sound, particularly regarding using the ICUCollationFilter instead. In my opinion, this should be a separate integration, even though it will index at a significantly faster speed with much smaller keys. The reason is that it is not compat with the JDK collation keys, and has different properties, such as the fact Collator is thread-safe in the JDK, but not thread-safe in ICU. Because of this, I decided to stick with the JDK impl initially.
[jira] Commented: (SOLR-773) Incorporate Local Lucene/Solr
[ https://issues.apache.org/jira/browse/SOLR-773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12781094#action_12781094 ] patrick o'leary commented on SOLR-773:

11/21/09 21:00 PDT patrick o'leary to locallucene-users, locallucene-developers

Folks, I've updated localsolr to work with solr-1.4 release, also works with solr-1.5?? nightly as of 11/21/09. There are a couple of changes needed to upgrade to this version.
1) schema.xml has to be updated: lat / long fields and dynamic field _localTier* have to be updated to type=tdouble
2) your index has to be rebuilt from scratch.
This is not ideal, but unfortunately numeric util updates in lucene force us down this path. As always I've put a batteries included demo on http://www.nsshutdown.com/solr-example.tgz

Thanks
Patrick

Incorporate Local Lucene/Solr
Key: SOLR-773
URL: https://issues.apache.org/jira/browse/SOLR-773
Project: Solr
Issue Type: New Feature
Reporter: Grant Ingersoll
Assignee: Grant Ingersoll
Priority: Minor
Fix For: 1.5
Attachments: exampleSpatial.zip, lucene-spatial-2.9-dev.jar, lucene.tar.gz, SOLR-773-local-lucene.patch, SOLR-773-local-lucene.patch, SOLR-773-local-lucene.patch, SOLR-773-local-lucene.patch, SOLR-773-local-lucene.patch, SOLR-773-spatial_solr.patch, SOLR-773.patch, SOLR-773.patch, solrGeoQuery.tar, spatial-solr.tar.gz

Local Lucene has been donated to the Lucene project. It has some Solr components, but we should evaluate how best to incorporate it into Solr. See http://lucene.markmail.org/message/orzro22sqdj3wows?q=LocalLucene