[ https://issues.apache.org/jira/browse/SOLR-5698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13892429#comment-13892429 ]
Hoss Man commented on SOLR-5698:
--------------------------------

Easy steps to reproduce using the example configs...

{noformat}
hossman@frisbee:~$ perl -le 'print "a,aaa"; print "z," . ("Z" x 32767);' | curl 'http://localhost:8983/solr/update?header=false&fieldnames=name,long_s&rowid=id&commit=true' -H 'Content-Type: application/csv' --data-binary @-
<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">572</int></lst>
</response>
hossman@frisbee:~$ curl 'http://localhost:8983/solr/select?q=*:*&fl=id,name&wt=json&indent=true'
{
  "responseHeader":{
    "status":0,
    "QTime":12,
    "params":{
      "fl":"id,name",
      "indent":"true",
      "q":"*:*",
      "wt":"json"}},
  "response":{"numFound":2,"start":0,"docs":[
      {
        "name":"a",
        "id":"0"},
      {
        "name":"z",
        "id":"1"}]
  }}
hossman@frisbee:~$ curl 'http://localhost:8983/solr/select?q=long_s:*&wt=json&indent=true'
{
  "responseHeader":{
    "status":0,
    "QTime":1,
    "params":{
      "indent":"true",
      "q":"long_s:*",
      "wt":"json"}},
  "response":{"numFound":1,"start":0,"docs":[
      {
        "name":"a",
        "long_s":"aaa",
        "id":"0",
        "_version_":1459225819107819520}]
  }}
{noformat}

> exceptionally long terms are silently ignored during indexing
> -------------------------------------------------------------
>
>                 Key: SOLR-5698
>                 URL: https://issues.apache.org/jira/browse/SOLR-5698
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Hoss Man
>
> As reported on the user list, when a term is greater than 2^15 bytes it is
> silently ignored at indexing time -- no error is given at all.
> we should investigate:
> * if there is a way to get the lower level lucene code to propagate up an
> error we can return to the user instead of silently ignoring these terms
> * if there is no way to generate a low level error:
> ** is there at least a way to make this limit configurable so it's more obvious
> to users that this limit exists?
> ** should we make things like StrField do explicit size checking on the terms
> they produce and explicitly throw their own error?
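
A minimal sketch of what the explicit size check asked about in the last bullet could look like. It is Python for brevity, not actual StrField or Lucene code. The names `check_term_length`, `TermTooLongError` and `MAX_TERM_BYTES` are hypothetical, and 32766 is an assumed value for the limit (the description only says roughly 2^15 bytes). The point is only that the check measures encoded bytes, not characters, and fails loudly instead of dropping the term.

```python
# Hypothetical pre-indexing check; not Solr or Lucene code.
# MAX_TERM_BYTES is an assumed value for the per-term byte limit.
MAX_TERM_BYTES = 32766


class TermTooLongError(ValueError):
    """Raised instead of silently dropping an oversized term."""


def check_term_length(field, value, max_bytes=MAX_TERM_BYTES):
    """Return the UTF-8 byte length of value, or raise if it exceeds max_bytes."""
    size = len(value.encode("utf-8"))
    if size > max_bytes:
        raise TermTooLongError(
            "field %r: term is %d bytes, limit is %d" % (field, size, max_bytes)
        )
    return size


if __name__ == "__main__":
    # Same two rows as the CSV in the reproduction steps.
    rows = [("a", "aaa"), ("z", "Z" * 32767)]
    for doc_id, (name, long_s) in enumerate(rows):
        try:
            check_term_length("long_s", long_s)
            print("doc %d ok" % doc_id)
        except TermTooLongError as e:
            print("doc %d rejected: %s" % (doc_id, e))
```

With a check like this the second document in the reproduction would be rejected with an explicit error rather than indexed without its long_s value.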