Re: Real-Time get and Dynamic Fields: possible bug.
Yep, but those dynamic fields have the field type "string", so each unique indexed term will be the entire field value, and the faceted terms counted will match each field value exactly. That's why I was confused. Typically I use faceting with non-tokenized string field values for simple stats and that kind of thing.

Do you think the behavior explained (I mean, ghost dynamic field values when using the real-time request handler) can be a bug? I don't mind investigating it this weekend and trying to patch it.

2015-05-14 18:59 GMT+02:00 Yonik Seeley:

> On Thu, May 14, 2015 at 12:49 PM, Luis Cappa Banda wrote:
> > If you don't mark as stored a field indexed and 'facetable', I was
> > expecting to not be able to return their values, so faceting has no sense.
>
> Faceting does not use or retrieve stored field values. The labels
> faceting returns are from the indexed values.
>
> "If you want the value returned, it needs to be stored" only applies
> to fields in the main document list (the fields that are retrieved for
> the top ranked documents).
>
> -Yonik

--
- Luis Cappa
Re: Real-Time get and Dynamic Fields: possible bug.
On Thu, May 14, 2015 at 12:49 PM, Luis Cappa Banda wrote:
> If you don't mark as stored a field indexed and 'facetable', I was
> expecting to not be able to return their values, so faceting has no sense.

Faceting does not use or retrieve stored field values. The labels faceting returns are from the indexed values.

"If you want the value returned, it needs to be stored" only applies to fields in the main document list (the fields that are retrieved for the top ranked documents).

-Yonik
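To make the point about indexed versus stored values concrete, here is a toy Java sketch. It is not Solr or Lucene code, and every name in it (FacetFromIndexedTermsDemo, asStringField, asTokenizedField, the sample paths) is made up for illustration. It assumes only what the message above says: facet labels and counts are derived from the terms that were indexed for a field. Note that no "stored" flag appears anywhere in it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

class FacetFromIndexedTermsDemo {

    // "string"-like field: the whole value is indexed as a single term.
    static List<String> asStringField(String value) {
        return Collections.singletonList(value);
    }

    // Tokenized field (simplified): split on non-word characters, drop empties.
    static List<String> asTokenizedField(String value) {
        List<String> terms = new ArrayList<>();
        for (String token : value.split("\\W+")) {
            if (!token.isEmpty()) {
                terms.add(token);
            }
        }
        return terms;
    }

    // Counts, per indexed term, how many documents contain it.
    // Each inner list holds the terms indexed for one document.
    static Map<String, Integer> facetCounts(List<List<String>> indexedTermsPerDoc) {
        Map<String, Integer> counts = new TreeMap<>();
        for (List<String> terms : indexedTermsPerDoc) {
            for (String term : new TreeSet<>(terms)) { // one count per document
                counts.merge(term, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> values = Arrays.asList("/home/luis", "/home/luis", "/home/ana");
        List<List<String>> stringTerms = new ArrayList<>();
        List<List<String>> tokenTerms = new ArrayList<>();
        for (String v : values) {
            stringTerms.add(asStringField(v));
            tokenTerms.add(asTokenizedField(v));
        }
        System.out.println(facetCounts(stringTerms)); // {/home/ana=1, /home/luis=2}
        System.out.println(facetCounts(tokenTerms));  // {ana=1, home=3, luis=2}
    }
}
```

With the string-like field the facet labels are the full values, which is the "simple stats" use case discussed in this thread; with the tokenized field the labels are the individual terms.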
Re: Real-Time get and Dynamic Fields: possible bug.
That is something I didn't know, but I thought it was mandatory. I'll try to explain step by step my (I think) logical way to understand it:

- If a field is indexed, you can search by it.
- When faceting, you have to index the field (because it can be tokenized and then you would like to facet by its terms). So you need to mark as indexed those fields you want to facet by.
- If you mark a field as stored, you can return its value as the 'original value' that was stored.
- If you facet, you are searching, counting terms and returning values and their counters. Thus, that "returning their values" step is where I thought 'stored=true' was necessary. If you don't mark as stored a field indexed and 'facetable', I was expecting to not be able to return their values, so faceting has no sense.

That's what I thought, of course. If it is not necessary, that's perfect: the lighter the data, the better, and one more thing I've learned, :-)

Anyway, I think that the question is still open: both are dynamic fields, stored (it is not necessary, OK) and indexed. When using the real-time requestHandler, the i18n* dynamic fields are returned but the *_facet ones are not. However, when using the default /select requestHandler and finding by the document id, both i18n* and *_facet fields are returned. You can try it with Solr 5.1, the version I'm currently using. The only differences between them are:

- Regular expression: i18n* VS *_facet
- Multivalued: *_facet are multivalued.

Regards,

- Luis Cappa

2015-05-14 18:32 GMT+02:00 Yonik Seeley:

> On Thu, May 14, 2015 at 10:47 AM, Luis Cappa Banda wrote:
> > Hi Yonik,
> >
> > Yes, they are the target from copyFields in the schema.xml. These *_target
> > fields are supposed to be used in some specific searchable (thus, tokenized)
> > fields that in the future are candidates to be faceted to return some
> > stats. For example, imagine that you have a field storing a directory path
> > and you want to search by. Also, you may want to facet by the whole
> > directory path value (not just their terms). That's why I'm storing both
> > field values: searchable and tokenized one, string and 'facet candidate'
> > one.
>
> OK, but you don't need to *store* the values in _facet, right?
>
> -Yonik

--
- Luis Cappa
Re: Real-Time get and Dynamic Fields: possible bug.
On Thu, May 14, 2015 at 10:47 AM, Luis Cappa Banda wrote:
> Hi Yonik,
>
> Yes, they are the target from copyFields in the schema.xml. These *_target
> fields are supposed to be used in some specific searchable (thus, tokenized)
> fields that in the future are candidates to be faceted to return some
> stats. For example, imagine that you have a field storing a directory path
> and you want to search by. Also, you may want to facet by the whole
> directory path value (not just their terms). That's why I'm storing both
> field values: searchable and tokenized one, string and 'facet candidate'
> one.

OK, but you don't need to *store* the values in _facet, right?

-Yonik
Re: Real-Time get and Dynamic Fields: possible bug.
Hi Yonik,

Yes, they are the target from copyFields in the schema.xml. These *_target fields are supposed to be used in some specific searchable (thus, tokenized) fields that in the future are candidates to be faceted to return some stats. For example, imagine that you have a field storing a directory path and you want to search by it. Also, you may want to facet by the whole directory path value (not just its terms). That's why I'm storing both field values: a searchable, tokenized one, and a string, 'facet candidate' one.

What I do not understand is that both i18n* and *_target are dynamic, indexed and stored values. The only difference is that the *_target one is multivalued. Does it make sense?

Regards

- Luis Cappa

2015-05-14 16:42 GMT+02:00 Yonik Seeley:

> Are the _facet fields the target of a copyField in the schema?
> Realtime get either gets the values from the transaction log (and if
> you didn't send it the values, they won't be there) or gets them from
> the index to try and reconstruct what was sent in.
>
> It's generally not recommended to have copyField targets "stored", or
> have a mix of explicitly set values and copyField values in the same
> field.
>
> -Yonik

--
- Luis Cappa
Re: Real-Time get and Dynamic Fields: possible bug.
Ehem, *_target ---> *_facet.

2015-05-14 16:47 GMT+02:00 Luis Cappa Banda:

> Hi Yonik,
>
> Yes, they are the target from copyFields in the schema.xml. These *_target
> fields are supposed to be used in some specific searchable (thus, tokenized)
> fields that in the future are candidates to be faceted to return some
> stats. For example, imagine that you have a field storing a directory path
> and you want to search by. Also, you may want to facet by the whole
> directory path value (not just their terms). That's why I'm storing both
> field values: searchable and tokenized one, string and 'facet candidate'
> one.
>
> What I do not understand is that both i18n* and *_target are dynamic,
> indexed and stored values. The only difference is that *_target one is
> multivalued. Does it have some sense?
>
> Regards
>
> - Luis Cappa

--
- Luis Cappa
Re: Real-Time get and Dynamic Fields: possible bug.
Are the _facet fields the target of a copyField in the schema?
Realtime get either gets the values from the transaction log (and if you didn't send it the values, they won't be there) or gets them from the index to try and reconstruct what was sent in.

It's generally not recommended to have copyField targets "stored", or have a mix of explicitly set values and copyField values in the same field.

-Yonik

On Thu, May 14, 2015 at 7:17 AM, Luis Cappa Banda wrote:
> Hi there,
>
> I have the following dynamicFields definition in my schema.xml:
>
> "true" stored="true" multiValued="true" />
>
> I've seen that when fetching documents with /select?q=id:whateverId, the
> results returned include both i18n* and *_facet fields filled. However,
> when using real-time request handler (/get?ids=whateverIds) the result
> fetched include only i18n* dynamic fields, but *_facet ones are not
> included.
>
> I have the impression during /get RequestHandler the server-side regular
> expression used when parsing fields and fields values to return documents
> with existing dynamic fields seems to be wrong. From the client side, I've
> checked that the class DocField.java that parses SolrDocument to Bean ones
> uses the following matcher:
>
>     } else if (annotation.value().indexOf('*') >= 0) { // dynamic fields are annotated as @Field("categories_*")
>       // if the field was annotated as a dynamic field, convert the name into a pattern
>       // the wildcard (*) is supposed to be either a prefix or a suffix, hence the use of replaceFirst
>       name = annotation.value().replaceFirst("\\*", "\\.*");
>       dynamicFieldNamePatternMatcher = Pattern.compile("^" + name + "$");
>     } else {
>       name = annotation.value();
>     }
>
> So maybe a similar behavior from the server-side is wrong. That's the only
> reason I find to understand why when using /select all fields are returned
> but when using /get those that matches *_facet regexp are not.
>
> If you can confirm that this is a bug (because maybe is the expected
> behavior, but after some years using Solr I think it is not) I can create
> the JIRA issue and debug it more deeply to apply a patch with the aim to
> help.
>
> Regards,
>
> --
> - Luis Cappa
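The explanation of realtime get above can be sketched as a toy Java model. This is not Solr's implementation. The class RealtimeGetToyModel, its methods, and the field names path and path_facet are all hypothetical. The model also makes one assumption that is not taken from Solr's source: it reads "reconstruct what was sent in" as leaving out fields that are copyField targets when the document comes from the index.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class RealtimeGetToyModel {
    // copyField rules: source field name -> destination field name.
    private final Map<String, String> copyFields = new LinkedHashMap<>();
    // Stand-in for the transaction log: documents exactly as the client sent them.
    private final Map<String, Map<String, List<String>>> updateLog = new LinkedHashMap<>();
    // Stand-in for the committed index: stored fields, copyField targets included.
    private final Map<String, Map<String, List<String>>> index = new LinkedHashMap<>();

    void addCopyField(String source, String dest) {
        copyFields.put(source, dest);
    }

    void add(String id, Map<String, List<String>> sentFields) {
        updateLog.put(id, new LinkedHashMap<>(sentFields));
    }

    // Moves logged documents into the index, applying the copyField rules.
    void commit() {
        for (Map.Entry<String, Map<String, List<String>>> e : updateLog.entrySet()) {
            Map<String, List<String>> stored = new LinkedHashMap<>(e.getValue());
            for (Map.Entry<String, String> rule : copyFields.entrySet()) {
                List<String> sourceValues = e.getValue().get(rule.getKey());
                if (sourceValues != null) {
                    List<String> merged =
                        new ArrayList<>(stored.getOrDefault(rule.getValue(), new ArrayList<>()));
                    merged.addAll(sourceValues);
                    stored.put(rule.getValue(), merged);
                }
            }
            index.put(e.getKey(), stored);
        }
        updateLog.clear();
    }

    // Like /select by id in this model: returns everything stored in the index.
    Map<String, List<String>> select(String id) {
        return index.get(id);
    }

    // Like /get in this model: the log first, otherwise the index minus copyField targets.
    Map<String, List<String>> realtimeGet(String id) {
        Map<String, List<String>> fromLog = updateLog.get(id);
        if (fromLog != null) {
            return fromLog;
        }
        Map<String, List<String>> stored = index.get(id);
        if (stored == null) {
            return null;
        }
        Map<String, List<String>> rebuilt = new LinkedHashMap<>(stored);
        rebuilt.keySet().removeAll(copyFields.values()); // assumption, see lead-in
        return rebuilt;
    }

    public static void main(String[] args) {
        RealtimeGetToyModel solr = new RealtimeGetToyModel();
        solr.addCopyField("path", "path_facet");
        Map<String, List<String>> doc = new LinkedHashMap<>();
        doc.put("id", Arrays.asList("1"));
        doc.put("path", Arrays.asList("/home/luis/docs"));
        solr.add("1", doc);
        System.out.println(solr.realtimeGet("1").keySet()); // [id, path]
        solr.commit();
        System.out.println(solr.select("1").keySet());      // [id, path, path_facet]
        System.out.println(solr.realtimeGet("1").keySet()); // [id, path]
    }
}
```

In this model a field the client sends itself (such as the i18n* fields in the report) comes back from both handlers, while a stored copyField target shows up only through select, which matches the symptom described in the original message.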
Real-Time get and Dynamic Fields: possible bug.
Hi there,

I have the following dynamicFields definition in my schema.xml:

I've seen that when fetching documents with /select?q=id:whateverId, the results returned include both i18n* and *_facet fields filled. However, when using the real-time request handler (/get?ids=whateverIds) the result fetched includes only the i18n* dynamic fields; the *_facet ones are not included.

I have the impression that, in the /get RequestHandler, the server-side regular expression used when parsing fields and field values to return documents with existing dynamic fields is wrong. From the client side, I've checked that the class DocField.java, which parses SolrDocument objects into beans, uses the following matcher:

    } else if (annotation.value().indexOf('*') >= 0) { // dynamic fields are annotated as @Field("categories_*")
      // if the field was annotated as a dynamic field, convert the name into a pattern
      // the wildcard (*) is supposed to be either a prefix or a suffix, hence the use of replaceFirst
      name = annotation.value().replaceFirst("\\*", "\\.*");
      dynamicFieldNamePatternMatcher = Pattern.compile("^" + name + "$");
    } else {
      name = annotation.value();
    }

So maybe a similar behavior on the server side is wrong. That's the only reason I can find to explain why all fields are returned when using /select, but those that match the *_facet regexp are not when using /get.

If you can confirm that this is a bug (maybe it is the expected behavior, but after some years using Solr I think it is not), I can create the JIRA issue and debug it more deeply to apply a patch with the aim to help.

Regards,

--
- Luis Cappa
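As a quick check of the client-side matcher quoted above, the following standalone Java sketch applies the same replaceFirst conversion to both wildcard styles. The class name DynamicFieldPatternDemo and the sample field names are made up for illustration; only the two lines that build the pattern are taken from the quoted snippet.

```java
import java.util.regex.Pattern;

class DynamicFieldPatternDemo {

    // Same conversion as the quoted snippet: the first '*' in the annotation
    // value becomes the regex ".*", and the result is anchored with ^ and $.
    static Pattern toPattern(String annotationValue) {
        String name = annotationValue.replaceFirst("\\*", "\\.*");
        return Pattern.compile("^" + name + "$");
    }

    static boolean matches(String annotationValue, String fieldName) {
        return toPattern(annotationValue).matcher(fieldName).matches();
    }

    public static void main(String[] args) {
        System.out.println(toPattern("i18n*").pattern());        // ^i18n.*$
        System.out.println(toPattern("*_facet").pattern());      // ^.*_facet$
        System.out.println(matches("i18n*", "i18n_title_es"));   // true
        System.out.println(matches("*_facet", "path_facet"));    // true
        System.out.println(matches("*_facet", "i18n_title_es")); // false
    }
}
```

Both the prefix form (*_facet) and the suffix form (i18n*) produce a working pattern, so this particular conversion treats the two kinds of dynamic field name alike.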