[ 
https://issues.apache.org/jira/browse/SOLR-1270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12734375#action_12734375
 ] 

Matt Schraeder commented on SOLR-1270:
--------------------------------------

I just tried to replicate the problem I was having on a very small, fresh 
install of Solr, and I can't reproduce it either, which means I have no idea 
what I was doing to cause it.  Since the data I saw this happening with was 
coming from a database, there's a chance that some sort of float->string or 
string->float conversion went wrong on the way from the database into PHP.  I 
might try to replicate the problem at work and narrow down what was going on, 
but for now I'll assume I've been wrong about what was happening this whole 
time. My apologies to Hoss Man.

Since it seems this is all just invalid in, invalid out and I was mistaken 
about what was going on, I do think clarification needs to happen somewhere.  
As a newer user to Solr, I was confused by two things:

1) The example schema.xml's wording is vague for these fields.  When FloatField 
says it stores/indexes the "text value verbatim", it isn't obvious that this 
also means the value does not have to be a valid float. I understood it to 
mean that it still stores/indexes a valid float, and that the only difference 
between it and the sortable version is that FloatField sorts as text 
(1,10,11,2,23,3 rather than 1,2,3,10,11,23) and not numerically.  Instead, the 
schema should state that FloatField is identical to StrField except in how it 
is output (which assumes the data is a valid float), and warn the user that 
no verification of the input is done. It would also help to say that it is 
meant for legacy support and that most users should be using 
SortableFloatField.
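
To make the sort-order difference concrete, here is a minimal Python sketch 
(plain Python, not Solr code) that sorts the same values once as text and once 
as numbers:

```python
values = ["1", "2", "3", "10", "11", "23"]

# Sorting the raw strings compares them character by character.
text_order = sorted(values)

# Sorting on the parsed value compares magnitudes instead.
numeric_order = sorted(values, key=float)

print(text_order)     # ['1', '10', '11', '2', '23', '3']
print(numeric_order)  # ['1', '2', '3', '10', '11', '23']
```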

2) My first index build was done with sfloat.  After the build finished, I 
opened the index in Luke to verify that values were being stored and displayed 
as I expected. Luke was unable to display the sfloat fields in the stored 
documents, which led me to believe that the sortable float field also wouldn't 
be readable in Solr's output.  That is when I added a copy field so that I 
would have both a float and an sfloat, thinking I had to search/sort on the 
sfloat but display the float, and that is when I began having problems.  The 
documentation for the Sortable fields should state that Solr's internal form 
isn't human readable, so Lucene apps other than Solr will not be able to 
display those fields, but that Solr's own output will be human readable.

I would recommend reordering the fields in the example so that the first field 
type a new user comes across is the sortable version.  Comment out the 
non-sortable versions, note that they are for legacy support, and tell users 
to uncomment them only if they need them for some reason. Also warn that 
input is not validated, so invalid input produces invalid output.
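
As an illustration of "invalid in, invalid out", the following Python sketch 
shows what happens when an unverified string is written out as a bare JSON 
number, and how a client could check values before indexing. It is not Solr's 
JSON writer; the helper names (write_verbatim, is_json_number, parses) and the 
"price" field are made up for the example.

```python
import json
import re

# JSON number grammar: optional minus, integer part, optional fraction/exponent.
JSON_NUMBER = re.compile(r"-?(0|[1-9][0-9]*)(\.[0-9]+)?([eE][+-]?[0-9]+)?")

def is_json_number(raw):
    """Return True if the whole string is a valid JSON number literal."""
    return JSON_NUMBER.fullmatch(raw) is not None

def write_verbatim(field, raw):
    """Hypothetical writer that emits the stored text unquoted, unchecked."""
    return '{"%s":%s}' % (field, raw)

def parses(text):
    """Return True if the text is valid JSON."""
    try:
        json.loads(text)
        return True
    except ValueError:
        return False

good = write_verbatim("price", "12.50")
bad = write_verbatim("price", "12.50 USD")

print(parses(good), parses(bad))
print(is_json_number("12.50"), is_json_number("12.50 USD"))
```

Running a check like is_json_number on the client before sending documents 
would catch such values at index time rather than when the response is parsed.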

> The FloatField (and probably others) field type takes any string value at 
> index, but JSON writer outputs as numeric without checking
> ------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-1270
>                 URL: https://issues.apache.org/jira/browse/SOLR-1270
>             Project: Solr
>          Issue Type: Bug
>          Components: search
>    Affects Versions: 1.2, 1.3, 1.4
>         Environment: ubuntu 8.04, sun java 6, tomcat 5.5
>            Reporter: Donovan Jimenez
>            Priority: Minor
>             Fix For: 1.4
>
>         Attachments: SOLR-1270.patch
>
>
> The FloatField field type takes any string value at index. These values 
> aren't necessarily valid JSON numerics, but the JSON writer does not check 
> their validity before writing them out as JSON numerics.
> I'm aware of the SortableFloatField which does do index time verification and 
> conversion of the value, but the way the JSON writer is working seemed like 
> either a bug that needs to be addressed or perhaps a gotcha that needs to be 
> better documented?
> This issue originally came from my php client issue tracker: 
> http://code.google.com/p/solr-php-client/issues/detail?id=13

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
