Re: [custom data structure] aligned dynamic fields
Jack,

Thanks for your response.

1. Flattening could be an option, although our scale (billions of docs) and required functionality (runtime facets not backed by DocValues) are beyond what Solr 3 can handle. We have already flattened the metadata at the expense of over-generating Solr documents, but solving the problem I described via flattening would have a big impact on scalability and cost.

2. Our situation is quite the opposite of what you described for dynamic fields: there will be very few per document. I agree that caution is needed here, as we have suffered from (or should I say gained experience with) multivalued fields; the good thing is that we never had to facet on them.

Any other options? Maybe someone can share their experience with dynamic fields and discourage us from pursuing this path?

Dmitry
Re: [custom data structure] aligned dynamic fields
Although we are entering the era of Big Data, that does not mean there are no limits or restrictions on what a given technology can do. Maybe you need to consider either a smaller scope for your project, or more limited features, or some other form of simplification.

Solr can do billions of documents on a heavily sharded cluster, but you will have to work really hard to make that work well.

So I can confirm that, in this case, there may be no free lunch, unless you are willing to strip down the project.

Or maybe we just need a deeper feel for what your data model is really trying to achieve.

Suggestion: Think about your data model again, and then try rephrasing it for this group. You have violated one cardinal rule of this group: you focused on a proposed solution rather than focusing our attention on the original problem you are trying to solve. That short-circuited our focus on really solving your problem.

-- Jack Krupansky

-----Original Message-----
From: Dmitry Kan
Sent: Wednesday, May 22, 2013 6:50 AM
To: solr-user@lucene.apache.org
Subject: Re: [custom data structure] aligned dynamic fields

Jack,

Thanks for your response.

1. Flattening could be an option, although our scale (billions of docs) and required functionality (runtime facets not backed by DocValues) are beyond what Solr 3 can handle. We have already flattened the metadata at the expense of over-generating Solr documents, but solving the problem I described via flattening would have a big impact on scalability and cost.

2. Our situation is quite the opposite of what you described for dynamic fields: there will be very few per document. I agree that caution is needed here, as we have suffered from (or should I say gained experience with) multivalued fields; the good thing is that we never had to facet on them.

Any other options? Maybe someone can share their experience with dynamic fields and discourage us from pursuing this path?

Dmitry

On Mon, May 20, 2013 at 4:23 PM, Jack Krupansky j...@basetechnology.com wrote:

Before you dive off the deep end and go crazy with dynamic fields, try a clean, simple, Solr-oriented static design. Yes, you CAN do an over-complicated design with dynamic fields, but that doesn't mean you should.

In a single phrase, denormalize and flatten your design. Sure, that will lead to a lot of rows, but Solr and Lucene are designed to do well in that scenario.

If you are still thinking in terms of a C struct, go for a long walk or do SOMETHING else until you can get that idea out of your head. It is a sub-optimal approach for exploiting the power of Lucene and Solr.

Stay with a static schema design until you hit... just stay with a static schema, period.

Dynamic fields and multi-valued fields do have value, but only when used in moderation, in small numbers. If you start down a design path and find that you are heavily dependent on dynamic fields and/or multi-valued fields with large numbers of values per document, that is feedback that your design needs to be denormalized and flattened further.

-- Jack Krupansky

-----Original Message-----
From: Dmitry Kan
Sent: Monday, May 20, 2013 7:06 AM
To: solr-user@lucene.apache.org
Subject: [custom data structure] aligned dynamic fields

Hi all,

Our current project requirements suggest that we should start storing custom data structures in the Solr index. The custom data structure would be the equivalent of a C struct.

The task is as follows. Suppose we have two types of fields, FieldName1 and FieldName2. Suppose also that we can have multiple pairs of these two fields on a document in Solr. That is, in dynamic-field notation:

doc1
FieldName1_id1 FieldName2_id1
FieldName1_id2 FieldName2_id2

doc2
FieldName1_id3 FieldName2_id3
FieldName1_id4 FieldName2_id4
FieldName1_id5 FieldName2_id5

etc.

What we would like is to supply a value for FieldName1_(some_unique_id) and a value for FieldName2_(some_unique_id) as search input. That is, in some search scenarios we would not care about some_unique_id: the search would automatically iterate over the pairs of dynamic fields and respect the pairings.

As far as I know, with dynamic fields a client has had to provide the dynamically generated field names, coupled with their values, up front when searching.

What data structure or solution could be used as an alternative approach to support such structured search?

Thanks,
Dmitry
[custom data structure] aligned dynamic fields
Hi all,

Our current project requirements suggest that we should start storing custom data structures in the Solr index. The custom data structure would be the equivalent of a C struct.

The task is as follows. Suppose we have two types of fields, FieldName1 and FieldName2. Suppose also that we can have multiple pairs of these two fields on a document in Solr. That is, in dynamic-field notation:

doc1
FieldName1_id1 FieldName2_id1
FieldName1_id2 FieldName2_id2

doc2
FieldName1_id3 FieldName2_id3
FieldName1_id4 FieldName2_id4
FieldName1_id5 FieldName2_id5

etc.

What we would like is to supply a value for FieldName1_(some_unique_id) and a value for FieldName2_(some_unique_id) as search input. That is, in some search scenarios we would not care about some_unique_id: the search would automatically iterate over the pairs of dynamic fields and respect the pairings.

As far as I know, with dynamic fields a client has had to provide the dynamically generated field names, coupled with their values, up front when searching.

What data structure or solution could be used as an alternative approach to support such structured search?

Thanks,
Dmitry
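
To make the requirement above concrete, here is a minimal sketch in plain Python of the pairing semantics being asked for. It does not use Solr at all; the field values ("red", "small", and so on) and the helper names `pairs` and `search` are assumptions made up for illustration. Only the FieldName1_<id> / FieldName2_<id> naming comes from the message.

```python
import re
from collections import defaultdict

# Hypothetical documents using the FieldName1_<id> / FieldName2_<id> naming
# from the message; the values are invented for illustration.
docs = {
    "doc1": {
        "FieldName1_id1": "red", "FieldName2_id1": "small",
        "FieldName1_id2": "blue", "FieldName2_id2": "large",
    },
    "doc2": {
        "FieldName1_id3": "red", "FieldName2_id3": "large",
        "FieldName1_id4": "green", "FieldName2_id4": "small",
        "FieldName1_id5": "blue", "FieldName2_id5": "small",
    },
}

# Splits a dynamic field name into its base name and its id suffix.
FIELD_RE = re.compile(r"^(FieldName[12])_(.+)$")

def pairs(doc):
    """Group a document's dynamic fields by their shared id suffix."""
    grouped = defaultdict(dict)
    for name, value in doc.items():
        match = FIELD_RE.match(name)
        if match:
            base, uid = match.groups()
            grouped[uid][base] = value
    return grouped

def search(docs, value1, value2):
    """Return ids of documents in which a single pair holds both values."""
    hits = []
    for doc_id, doc in docs.items():
        for pair in pairs(doc).values():
            if pair.get("FieldName1") == value1 and pair.get("FieldName2") == value2:
                hits.append(doc_id)
                break  # one matching pair is enough for this document
    return hits
```

With this data, `search(docs, "red", "large")` matches only doc2: doc1 contains both "red" and "large", but in different pairs, which is exactly the cross-pair match the question wants to avoid.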
Re: [custom data structure] aligned dynamic fields
Before you dive off the deep end and go crazy with dynamic fields, try a clean, simple, Solr-oriented static design. Yes, you CAN do an over-complicated design with dynamic fields, but that doesn't mean you should.

In a single phrase, denormalize and flatten your design. Sure, that will lead to a lot of rows, but Solr and Lucene are designed to do well in that scenario.

If you are still thinking in terms of a C struct, go for a long walk or do SOMETHING else until you can get that idea out of your head. It is a sub-optimal approach for exploiting the power of Lucene and Solr.

Stay with a static schema design until you hit... just stay with a static schema, period.

Dynamic fields and multi-valued fields do have value, but only when used in moderation, in small numbers. If you start down a design path and find that you are heavily dependent on dynamic fields and/or multi-valued fields with large numbers of values per document, that is feedback that your design needs to be denormalized and flattened further.

-- Jack Krupansky
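
The "denormalize and flatten" advice in this reply can be sketched in plain Python as follows. This is only an illustration of the idea, not Solr code: the `id` and `parent_id` field names, the sample values, and the helpers `flatten` and `search` are all assumptions. Each (FieldName1, FieldName2) pair becomes its own flat row with static field names, so a match on both fields within one row respects the pairing by construction.

```python
# Hypothetical parent documents, each holding (FieldName1, FieldName2) pairs.
# The values and the id / parent_id field names are assumptions.
parents = {
    "doc1": [("red", "small"), ("blue", "large")],
    "doc2": [("red", "large"), ("green", "small"), ("blue", "small")],
}

def flatten(parents):
    """Emit one flat row per pair; every row uses the same static field names."""
    rows = []
    for parent_id, pair_list in parents.items():
        for n, (value1, value2) in enumerate(pair_list, start=1):
            rows.append({
                "id": f"{parent_id}_{n}",
                "parent_id": parent_id,
                "FieldName1": value1,
                "FieldName2": value2,
            })
    return rows

def search(rows, value1, value2):
    """Match both values on the same row, then collapse hits to parent ids."""
    hits = []
    for row in rows:
        if row["FieldName1"] == value1 and row["FieldName2"] == value2:
            if row["parent_id"] not in hits:
                hits.append(row["parent_id"])
    return hits

rows = flatten(parents)
```

The trade-off is visible in the row count: two parent documents become five rows, one per pair. At the scale discussed in this thread, that multiplication of documents is the cost that has to be weighed against the simpler static schema.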