Re: [custom data structure] aligned dynamic fields

2013-05-22 Thread Dmitry Kan
Jack,

Thanks for your response.

1. Flattening could be an option, although our scale (billions of docs)
and required functionality (runtime facets not backed by DocValues) are
beyond what Solr 3 can handle. We have already flattened the metadata at
the expense of over-generating Solr documents. But solving the problem I
described via flattening would have a big impact on scalability and
cost.

2. Our case is quite the opposite of what you described for dynamic
fields: there will be very few per document. I agree that caution is
needed here, as we have suffered from (or should I say experienced)
multivalued fields; the good thing is that we never had to facet on them.

Any other options? Maybe someone can share their experience with dynamic
fields and discourage us from pursuing this path?

Dmitry


On Mon, May 20, 2013 at 4:23 PM, Jack Krupansky j...@basetechnology.com wrote:

 Before you dive off the deep end and go crazy with dynamic fields, try a
 clean, simple, Solr-oriented static design. Yes, you CAN do an
 over-complicated design with dynamic fields, but that doesn't mean you
 should.

 In a single phrase, denormalize and flatten your design. Sure, that will
 lead to a lot of rows, but Solr and Lucene are designed to do well in that
 scenario.

 If you are still thinking in terms of a C struct, go for a long walk or do
 SOMETHING else until you can get that idea out of your head. It is a
 sub-optimal approach for exploiting the power of Lucene and Solr.

 Stay with a static schema design until you hit... just stay with a static
 schema, period.

 Dynamic fields and multi-valued fields do have value, but only when used
 in moderation - small numbers. If you start down a design path and find
 that you are heavily dependent on dynamic fields and/or multi-valued fields
 with large numbers of values per document, that is feedback that your
 design needs to be denormalized and flattened further.

 -- Jack Krupansky

 -Original Message- From: Dmitry Kan
 Sent: Monday, May 20, 2013 7:06 AM
 To: solr-user@lucene.apache.org
 Subject: [custom data structure] aligned dynamic fields


 Hi all,

 Our current project requirement suggests that we should start storing
 custom data structures in the Solr index. The custom data structure would
 be the equivalent of a C struct.

 The task is as follows.

 Suppose we have two types of fields, one is FieldName1 and the other
 FieldName2.

 Suppose also that we can have multiple pairs of these two fields on a
 document in Solr.

 That is, in dynamic-field notation:

 doc1
 FieldName1_id1
 FieldName2_id1

 FieldName1_id2
 FieldName2_id2

 doc2
 FieldName1_id3
 FieldName2_id3

 FieldName1_id4
 FieldName2_id4

 FieldName1_id5
 FieldName2_id5

 etc

 What we would like is to supply a value for FieldName1_(some_unique_id) and
 a value for FieldName2_(some_unique_id) as search input. That is, in some
 search scenarios we would not care about some_unique_id, and the search
 would automatically iterate over the pairs of dynamic fields and respect
 the pairings.

 I know that, at least in the past, a client using dynamic fields had to
 provide the dynamically generated field names, coupled with their values,
 up front when searching.

 What data structure / solution could be used as an alternative approach to
 support such a structured search?

 Thanks,

 Dmitry



Re: [custom data structure] aligned dynamic fields

2013-05-22 Thread Jack Krupansky
Although we are entering the era of Big Data, that does not mean there are 
no limits or restrictions on what a given technology can do.


Maybe you need to consider either a smaller scope for your project, or more 
limited features, or some other form of simplification.


Solr can handle billions of documents on a heavily sharded cluster, but you
will have to work really hard to make that work well.


So I can confirm that, in this case, there may be no free lunch unless
you are willing to strip down the project. Or maybe we just need a deeper
feel for what your data model is really trying to achieve.


Suggestion: Think about your data model again, and then try rephrasing it 
for this group. You have violated one cardinal rule of this group: you 
focused on a proposed solution rather than focusing our attention on the 
original problem you are trying to solve. That short-circuited our focus on 
really solving your problem.


-- Jack Krupansky





[custom data structure] aligned dynamic fields

2013-05-20 Thread Dmitry Kan
Hi all,

Our current project requirement suggests that we should start storing
custom data structures in the Solr index. The custom data structure would
be the equivalent of a C struct.

The task is as follows.

Suppose we have two types of fields, one is FieldName1 and the other
FieldName2.

Suppose also that we can have multiple pairs of these two fields on a
document in Solr.

That is, in dynamic-field notation:

doc1
FieldName1_id1
FieldName2_id1

FieldName1_id2
FieldName2_id2

doc2
FieldName1_id3
FieldName2_id3

FieldName1_id4
FieldName2_id4

FieldName1_id5
FieldName2_id5

etc

What we would like is to supply a value for FieldName1_(some_unique_id) and
a value for FieldName2_(some_unique_id) as search input. That is, in some
search scenarios we would not care about some_unique_id, and the search
would automatically iterate over the pairs of dynamic fields and respect
the pairings.
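
To make the pairing requirement concrete, here is a minimal in-memory sketch
in Python. It is not Solr code: the helper names group_pairs and matches and
the sample values ("red", "small", and so on) are assumptions made up for
illustration. It only shows the semantics being asked for, namely that a
document matches when both values occur in the same pair, whatever that
pair's id is.

```python
from collections import defaultdict

PAIR_FIELDS = ("FieldName1", "FieldName2")


def group_pairs(doc):
    """Group dynamic fields such as 'FieldName1_id1' by their id suffix."""
    pairs = defaultdict(dict)
    for field, value in doc.items():
        base, sep, pair_id = field.rpartition("_")
        if sep and base in PAIR_FIELDS:
            pairs[pair_id][base] = value
    return dict(pairs)


def matches(doc, value1, value2):
    """True if one pair holds value1 in FieldName1 and value2 in FieldName2."""
    return any(
        pair.get("FieldName1") == value1 and pair.get("FieldName2") == value2
        for pair in group_pairs(doc).values()
    )


# Hypothetical sample document with two pairs.
doc1 = {
    "FieldName1_id1": "red", "FieldName2_id1": "small",
    "FieldName1_id2": "blue", "FieldName2_id2": "large",
}

print(matches(doc1, "red", "small"))  # both values sit in pair id1
print(matches(doc1, "red", "large"))  # values come from different pairs
```

The second call is the case that matters: a plain search over two
multivalued fields would accept it, while a pairing-aware search should not.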

I know that, at least in the past, a client using dynamic fields had to
provide the dynamically generated field names, coupled with their values,
up front when searching.
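
As an illustration of what providing the field names up front means for a
client, the sketch below builds a Lucene-style query string that enumerates
every pair id explicitly. The function name build_pair_query and the sample
values are assumptions, and the values are not escaped, so treat it as a
sketch rather than something to send to a production index.

```python
def build_pair_query(pair_ids, value1, value2):
    """Build one AND clause per known pair id and OR the clauses together.

    Assumes the client already knows every pair id; values are inserted
    verbatim, without escaping of quotes or other special characters.
    """
    clauses = [
        f'(FieldName1_{pid}:"{value1}" AND FieldName2_{pid}:"{value2}")'
        for pid in pair_ids
    ]
    return " OR ".join(clauses)


# Hypothetical ids and values.
print(build_pair_query(["id1", "id2"], "red", "small"))
```

The query grows with the number of distinct ids, which is the cost the
question is trying to avoid.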

What data structure / solution could be used as an alternative approach to
support such a structured search?

Thanks,

Dmitry


Re: [custom data structure] aligned dynamic fields

2013-05-20 Thread Jack Krupansky
Before you dive off the deep end and go crazy with dynamic fields, try a 
clean, simple, Solr-oriented static design. Yes, you CAN do an 
over-complicated design with dynamic fields, but that doesn't mean you 
should.


In a single phrase, denormalize and flatten your design. Sure, that will 
lead to a lot of rows, but Solr and Lucene are designed to do well in that 
scenario.


If you are still thinking in terms of a C struct, go for a long walk or do
SOMETHING else until you can get that idea out of your head. It is a 
sub-optimal approach for exploiting the power of Lucene and Solr.


Stay with a static schema design until you hit... just stay with a static 
schema, period.


Dynamic fields and multi-valued fields do have value, but only when used in 
moderation - small numbers. If you start down a design path and find that 
you are heavily dependent on dynamic fields and/or multi-valued fields with 
large numbers of values per document, that is feedback that your design 
needs to be denormalized and flattened further.
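
One way to read the denormalize-and-flatten advice is sketched below in
Python, under the assumption that each (FieldName1, FieldName2) pair becomes
its own row with two static fields. The function name flatten, the generated
"id" format and the "parent_id" field are hypothetical, chosen only for
illustration.

```python
PAIR_FIELDS = ("FieldName1", "FieldName2")


def flatten(doc_id, doc):
    """Yield one flat record per pair found in a dynamic-field document."""
    pairs = {}
    for field, value in doc.items():
        base, sep, pair_id = field.rpartition("_")
        if sep and base in PAIR_FIELDS:
            pairs.setdefault(pair_id, {})[base] = value
    for pair_id, fields in sorted(pairs.items()):
        yield {
            "id": f"{doc_id}_{pair_id}",   # hypothetical unique key per row
            "parent_id": doc_id,           # hypothetical link to the source doc
            "FieldName1": fields.get("FieldName1"),
            "FieldName2": fields.get("FieldName2"),
        }


# Hypothetical sample document with two pairs.
doc1 = {
    "FieldName1_id1": "red", "FieldName2_id1": "small",
    "FieldName1_id2": "blue", "FieldName2_id2": "large",
}

for row in flatten("doc1", doc1):
    print(row)
```

With rows shaped like this, a pairing-aware search reduces to a query on the
two static fields, at the price of one row per pair and of mapping hits back
to the parent through parent_id.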


-- Jack Krupansky
