Re: custom response writer which extends RawResponseWriter fails when shards > 1

Lee Carroll Thu, 19 Apr 2018 10:43:21 -0700

Hi,

I rewrote all of my tests to use SolrCloudTestCase rather than SolrTestCaseJ4
and was able to replicate the response writer issue and debug it with a sharded
collection. It turned out the issue was not really with my response writer
but rather with my config:


<requestHandler name="/content" class="solr.SearchHandler">
....

    <lst name="invariants">
        <str name="wt">content</str>
    </lst>

</requestHandler>

In cloud mode, having wt as an invariant breaks the collation of results
from shards, presumably because the invariant also overrides the wt=javabin
that the coordinator sends on its internal shard requests. Now I'm sure this
is a common mistake which I've repeated (blush), but I do still want to
implement my request handler this way. Is there a way to have a request
handler support a single response writer but still work in cloud mode?

Could this be considered a bug?
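
To make the question concrete, here is a minimal sketch of the behaviour I'm
after, written in plain Java rather than against the real Solr API. WtSelector
and applyWt are hypothetical names of my own; the only things taken from the
logs further down are the isShard=true and wt=javabin parameters on the
internal shard requests. How (or whether) this can be wired into a request
handler is exactly the part I'm unsure about.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical helper (not part of Solr): decides which wt a handler should
 * enforce, treating internal shard sub-requests differently from client requests.
 */
final class WtSelector {
    private WtSelector() {}

    /**
     * Returns a copy of the request parameters with wt adjusted.
     * Shard sub-requests (isShard=true) keep whatever wt the coordinator sent;
     * every other request is forced to the single supported writer.
     */
    static Map<String, String> applyWt(Map<String, String> params, String forcedWt) {
        Map<String, String> out = new HashMap<>(params);
        boolean isShard = Boolean.parseBoolean(params.getOrDefault("isShard", "false"));
        if (!isShard) {
            // Client-facing request: behave as if wt were an invariant.
            out.put("wt", forcedWt);
        }
        return out;
    }
}
```

With something like this, a client request would always come back through the
"content" writer, while the shard-to-coordinator traffic would stay on javabin.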

Lee C

On 18 April 2018 at 13:13, Mikhail Khludnev <m...@apache.org> wrote:

> Injecting headers might require deeper customisation up to establishing own
> filter or so.
> Speaking regarding your own WT, there might be some issues because usually
> it's not a big deal to use one wt for responding user query like (wt=csv)
> and wt=javabin in internal communication between aggregator and slaves like
> it happens in wt=csv query.
>
> On Wed, Apr 18, 2018 at 2:19 PM, Lee Carroll <lee.a.carr...@googlemail.com
> >
> wrote:
>
> > Inventive. I need to control content-type of the response from the
> document
> > field value. I have the actual content field and the content-type field
> to
> > use configured in the response writer. I've just noticed that the xslt
> > transformer allows you to do this but not controlled by document values.
> I
> > may also need to set some headers based on content-type and perhaps
> content
> > size, accept ranges comes to mind. Although I might be getting ahead of
> > myself.
> >
> >
> >
> > On 18 April 2018 at 12:05, Mikhail Khludnev <m...@apache.org> wrote:
> >
> > > well ..
> > > what if
> > > http://localhost:8983/solr/images/select?fl=content&q=id:
> > 1&start=1&wt=csv&
> > > csv.separator=&csv.encapsulator&csv.null=null
> > > ?
> > >
> > > On Wed, Apr 18, 2018 at 1:18 PM, Lee Carroll <
> > lee.a.carr...@googlemail.com
> > > >
> > > wrote:
> > >
> > > > sorry cut n paste error i'd get
> > > >
> > > > {
> > > >   "responseHeader":{
> > > >     "zkConnected":true,
> > > >     "status":0,
> > > >     "QTime":0,
> > > >     "params":{
> > > >       "q":"*:*",
> > > >       "fl":"content",
> > > >       "rows":"1"}},
> > > >   "response":{"numFound":1,"start":0,"docs":[
> > > >       {
> > > >         "content":"my-content-value"}]
> > > >   }}
> > > >
> > > >
> > > > but you get my point
> > > >
> > > >
> > > >
> > > > On 18 April 2018 at 11:13, Lee Carroll <lee.a.carr...@googlemail.com
> >
> > > > wrote:
> > > >
> > > > > for http://localhost:8983/solr/images/select?fl=content&q=id:
> > 1&start=1
> > > > >
> > > > > I'd get
> > > > >
> > > > > {
> > > > >   "responseHeader":{
> > > > >     "zkConnected":true,
> > > > >     "status":0,
> > > > >     "QTime":1,
> > > > >     "params":{
> > > > >       "q":"*:*",
> > > > >       "_":"1524046333220"}},
> > > > >   "response":{"numFound":1,"start":0,"docs":[
> > > > >       {
> > > > >         "id":"1",
> > > > >         "content":"my-content-value",
> > > > >         "*content-type*":"text/plain"}]
> > > > >   }}
> > > > >
> > > > > when i want
> > > > >
> > > > > my-content-value
> > > > >
> > > > >
> > > > >
> > > > > On 18 April 2018 at 10:55, Mikhail Khludnev <m...@apache.org>
> wrote:
> > > > >
> > > > > >> Lee, from this description I don't see why it can't be addressed by
> > > > > >> fl,rows params. What makes it different from the typical Solr usage?
> > > > >>
> > > > >>
> > > > >> On Wed, Apr 18, 2018 at 12:31 PM, Lee Carroll <
> > > > >> lee.a.carr...@googlemail.com>
> > > > >> wrote:
> > > > >>
> > > > >> > Sure, we want to return a single field's value for the top
> > matching
> > > > >> > document for a given query. Bare content rather than a full
> search
> > > > >> result
> > > > >> > listing.
> > > > >> >
> > > > >> > To be concrete:
> > > > >> >
> > > > >> > For a schema of fields id [unique key],
> > > content[stored],content-type[
> > > > >> > stored]
> > > > >> > For a request:
> > > > >> >
> > > > >> >    1. Request URL:
> > > > >> >    https://localhost/solr/content?q=id:1
> > > > >> >    2. Request Method:
> > > > >> >    GET
> > > > >> >
> > > > >> > We get a response
> > > > >> > HTTP/1.1 200 OK Content-Length: 16261 Content-Type:
> [content-type
> > > > value]
> > > > >> >
> > > > >> > and the body to be the raw value of content
> > > > >> >
> > > > >> > In short clients consume directly the most relevant "content"
> > > returned
> > > > >> from
> > > > >> > solr queries they construct.
> > > > >> >
> > > > >> > Naively I've implemented a subclass of RawResponseWriter which
> > takes
> > > > the
> > > > >> > first docs values and adds them to the appended "content"
> stream.
> > > > >> Should I
> > > > >> > selectively add the content stream depending on if this is the
> > final
> > > > >> > aggregation of cloud results (and provide a base class writer to
> > act
> > > > if
> > > > >> > not), if so how do I know its the final aggregation. Or is
> adding
> > > the
> > > > >> > content stream within the response writer a bad idea. Should
> that
> > be
> > > > >> being
> > > > >> > added to the response somewhere else?
> > > > >> >
> > > > >> > Failing all of the above is asking about response writer an X /
> Y
> > > > >> problem.
> > > > >> > Is their a better way to achieve the above. I'd looked at
> > > transforming
> > > > >> > response xml but that seemed not to offer a complete bare slate.
> > > > >> >
> > > > >> > Cheers Lee C
> > > > >> >
> > > > >> >
> > > > >> > On 17 April 2018 at 21:36, Mikhail Khludnev <m...@apache.org>
> > > wrote:
> > > > >> >
> > > > >> > > In distributed search response writer is used twice
> > > > >> > > https://lucene.apache.org/solr/guide/7_1/distributed-
> > > requests.html
> > > > >> > > once slave node that's where response writer yields "json"
> > content
> > > > >> and it
> > > > >> > > upset aggregator node which is expect only javabin.
> > > > >> > > I hardly can comment on rrw, it's probably used for responding
> > > > >> separate
> > > > >> > > files in distrib=false mode.
> > > > >> > > You can start from describing why you need to create own
> > response
> > > > >> writer.
> > > > >> > >
> > > > >> > > On Tue, Apr 17, 2018 at 7:02 PM, Lee Carroll <
> > > > >> > lee.a.carr...@googlemail.com
> > > > >> > > >
> > > > >> > > wrote:
> > > > >> > >
> > > > >> > > > Ok. My expectation was the response writer would not be used
> > > until
> > > > >> the
> > > > >> > > > final serialization of the result. If my response writer
> > breaks
> > > > the
> > > > >> > > > response writer contract, exactly the way rawResponseWriter
> > does
> > > > and
> > > > >> > just
> > > > >> > > > out puts a filed value how does that work? Does
> > > rawResponseWriter
> > > > >> > support
> > > > >> > > > cloud mode?
> > > > >> > > >
> > > > >> > > >
> > > > >> > > >
> > > > >> > > > On 17 April 2018 at 15:55, Mikhail Khludnev <
> m...@apache.org>
> > > > >> wrote:
> > > > >> > > >
> > > > >> > > > > That's what should happen.
> > > > >> > > > >
> > > > >> > > > > Expected mime type application/octet-stream but got
> > > > >> application/json.
> > > > >> > > > >
> > > > >> > > > > Distributed search coordinator expect to merge slave
> > responses
> > > > in
> > > > >> > > javabin
> > > > >> > > > > format. But slave's wt indicated json.
> > > > >> > > > > As far as I know only javabin might be used to distributed
> > > > search
> > > > >> > > > > underneath. Coordinator itself might yield json.
> > > > >> > > > >
> > > > >> > > > > On Tue, Apr 17, 2018 at 4:23 PM, Lee Carroll <
> > > > >> > > > lee.a.carr...@googlemail.com
> > > > >> > > > > >
> > > > >> > > > > wrote:
> > > > >> > > > >
> > > > >> > > > > > Sure
> > > > >> > > > > >
> > > > >> > > > > > with 1 shard 1 replica this request works fine
> > > > >> > > > > >
> > > > >> > > > > >    1. Request URL:
> > > > >> > > > > >    http://localhost:8983/solr/images/image?q=id:1
> > > > >> > > > > >    2. Request Method:
> > > > >> > > > > >    GET
> > > > >> > > > > >    3. Status Code:
> > > > >> > > > > >    200 OK
> > > > >> > > > > >
> > > > >> > > > > > logs are clean
> > > > >> > > > > >
> > > > >> > > > > > with 2 shards 2 replicas the same request fails and in
> the
> > > > logs
> > > > >> > > > > >
> > > > >> > > > > >
> > > > >> > > > > > INFO  - 2018-04-17 13:20:32.052; [c:images s:shard2
> > > > r:core_node7
> > > > >> > > > > > x:images_shard2_replica_n4]
> org.apache.solr.core.SolrCore;
> > > > >> > > > > > [images_shard2_replica_n4]  webapp=/solr path=/image
> > > > >> > > > > > params={df=text&distrib=false&qt=/image&fl=id&fl=score&
> > > > >> > > > > > shards.purpose=4&start=0&fsv=true&shard.url=
> > > > >> > > > > > http://10.224.30.207:8983/
> solr/images_shard2_replica_n4/
> > > > >> > > > > > |http://10.224.30.207:7574/
> solr/images_shard2_replica_n6/
> > > > >> > > > > > &rows=10&version=2&q=id:1&NOW=
> > > 1523971232039&isShard=true&wt=
> > > > >> > javabin}
> > > > >> > > > > > hits=0 status=0 QTime=0
> > > > >> > > > > > ERROR - 2018-04-17 13:20:32.055; [c:images s:shard1
> > > > r:core_node3
> > > > >> > > > > > x:images_shard1_replica_n1]
> org.apache.solr.common.SolrExc
> > > > >> eption;
> > > > >> > > > > > org.apache.solr.client.solrj.impl.HttpSolrClient$
> > > > >> > > RemoteSolrException:
> > > > >> > > > > > Error
> > > > >> > > > > > from server at http://10.224.30.207:8983/
> > > > >> > > solr/images_shard2_replica_n4
> > > > >> > > > :
> > > > >> > > > > > Expected mime type application/octet-stream but got
> > > > >> > application/json.
> > > > >> > > > > > at
> > > > >> > > > > > org.apache.solr.client.solrj.impl.HttpSolrClient.
> > > > >> > > > > > executeMethod(HttpSolrClient.java:607)
> > > > >> > > > > > at
> > > > >> > > > > > org.apache.solr.client.solrj.
> impl.HttpSolrClient.request(
> > > > >> > > > > > HttpSolrClient.java:255)
> > > > >> > > > > > at
> > > > >> > > > > > org.apache.solr.client.solrj.
> impl.HttpSolrClient.request(
> > > > >> > > > > > HttpSolrClient.java:244)
> > > > >> > > > > > at
> > > > >> > > > > > org.apache.solr.client.solrj.impl.LBHttpSolrClient.
> > > > >> > > > > > doRequest(LBHttpSolrClient.java:483)
> > > > >> > > > > > at
> > > > >> > > > > > org.apache.solr.client.solrj.
> > impl.LBHttpSolrClient.request(
> > > > >> > > > > > LBHttpSolrClient.java:413)
> > > > >> > > > > > at
> > > > >> > > > > > org.apache.solr.handler.component.
> > HttpShardHandlerFactory.
> > > > >> > > > > > makeLoadBalancedRequest(HttpShardHandlerFactory.java:
> 273)
> > > > >> > > > > > at
> > > > >> > > > > > org.apache.solr.handler.component.HttpShardHandler.
> > > > >> > lambda$submit$0(
> > > > >> > > > > > HttpShardHandler.java:175)
> > > > >> > > > > > at java.util.concurrent.FutureTask.run(FutureTask.
> > java:266)
> > > > >> > > > > > at java.util.concurrent.Executors$RunnableAdapter.
> > > > >> > > > > call(Executors.java:511)
> > > > >> > > > > > at java.util.concurrent.FutureTask.run(FutureTask.
> > java:266)
> > > > >> > > > > > at
> > > > >> > > > > > com.codahale.metrics.InstrumentedExecutorService$
> > > > >> > > > > InstrumentedRunnable.run(
> > > > >> > > > > > InstrumentedExecutorService.java:176)
> > > > >> > > > > > at
> > > > >> > > > > > org.apache.solr.common.util.ExecutorUtil$
> > > > >> > MDCAwareThreadPoolExecutor.
> > > > >> > > > > > lambda$execute$0(ExecutorUtil.java:188)
> > > > >> > > > > > at
> > > > >> > > > > > java.util.concurrent.ThreadPoolExecutor.runWorker(
> > > > >> > > > > > ThreadPoolExecutor.java:1142)
> > > > >> > > > > > at
> > > > >> > > > > > java.util.concurrent.ThreadPoolExecutor$Worker.run(
> > > > >> > > > > > ThreadPoolExecutor.java:617)
> > > > >> > > > > > at java.lang.Thread.run(Thread.java:745)
> > > > >> > > > > >
> > > > >> > > > > > INFO  - 2018-04-17 13:20:32.056; [c:images s:shard1
> > > > r:core_node3
> > > > >> > > > > > x:images_shard1_replica_n1]
> org.apache.solr.core.SolrCore;
> > > > >> > > > > > [images_shard1_replica_n1]  webapp=/solr path=/image
> > > > >> > params={q=id:1}
> > > > >> > > > > > status=200 QTime=17
> > > > >> > > > > > INFO  - 2018-04-17 13:20:32.055; [c:images s:shard1
> > > > r:core_node3
> > > > >> > > > > > x:images_shard1_replica_n1]
> org.apache.solr.core.SolrCore;
> > > > >> > > > > > [images_shard1_replica_n1]  webapp=/solr path=/image
> > > > >> > > > > > params={df=text&distrib=false&qt=/image&fl=id&fl=score&
> > > > >> > > > > > shards.purpose=4&start=0&fsv=true&shard.url=
> > > > >> > > > > > http://10.224.30.207:8983/
> solr/images_shard1_replica_n1/
> > > > >> > > > > > |http://10.224.30.207:7574/
> solr/images_shard1_replica_n2/
> > > > >> > > > > > &rows=10&version=2&q=id:1&NOW=
> > > 1523971232039&isShard=true&wt=
> > > > >> > javabin}
> > > > >> > > > > > hits=1 status=0 QTime=2
> > > > >> > > > > >
> > > > >> > > > > >
> > > > >> > > > > > I've implemented getcontenttype simply as
> > > > >> > > > > >
> > > > >> > > > > > @Override
> > > > >> > > > > > public String getContentType(SolrQueryRequest request,
> > > > >> > > > > > SolrQueryResponse response) {
> > > > >> > > > > >
> > > > >> > > > > >     return "application/json;charset=utf-8";
> > > > >> > > > > > }
> > > > >> > > > > >
> > > > >> > > > > >
> > > > >> > > > > >
> > > > >> > > > > >
> > > > >> > > > > >
> > > > >> > > > > > On 16 April 2018 at 17:37, Mikhail Khludnev <
> > > m...@apache.org>
> > > > >> > wrote:
> > > > >> > > > > >
> > > > >> > > > > > > Lee,
> > > > >> > > > > > > It's worth to send a stacktrace for such kind of
> > > inquiries.
> > > > >> > > > > > > I guess it goes from QueryComponent.mergeIds() or so.
> > > Shard
> > > > >> > > response
> > > > >> > > > > > should
> > > > >> > > > > > > contains <uniqueKey> from schema.xml field.
> > > > >> > > > > > > I encounter something like this while troubleshooting
> > > > >> > > > > > > https://lucene.apache.org/
> solr/guide/6_6/transforming-
> > > > >> > > > > > > result-documents.html#TransformingResultDocuments-
> > > > >> > > > > > > CoresandCollectionsinSolrCloud
> > > > >> > > > > > >
> > > > >> > > > > > >
> > > > >> > > > > > > On Mon, Apr 16, 2018 at 6:56 PM, Lee Carroll <
> > > > >> > > > > > lee.a.carr...@googlemail.com
> > > > >> > > > > > > >
> > > > >> > > > > > > wrote:
> > > > >> > > > > > >
> > > > >> > > > > > > > I've created a custom response writer which extends
> > > > >> > > > > RawResponseWriter.
> > > > >> > > > > > > The
> > > > >> > > > > > > > basic operation is to output a single field value
> from
> > > the
> > > > >> top
> > > > >> > > > > matching
> > > > >> > > > > > > doc
> > > > >> > > > > > > > as the entire response. This works when shards = 1
> but
> > > > fails
> > > > >> > when
> > > > >> > > > > > shards
> > > > >> > > > > > > > are greater than 1.
> > > > >> > > > > > > >
> > > > >> > > > > > > > I throw an error if the field in question is missing
> > > from
> > > > >> the
> > > > >> > top
> > > > >> > > > > doc.
> > > > >> > > > > > > This
> > > > >> > > > > > > > happens when individual shards are being searched
> and
> > > only
> > > > >> id
> > > > >> > and
> > > > >> > > > > score
> > > > >> > > > > > > are
> > > > >> > > > > > > > returned. I'm sure I've committed a basic error.
> > > > >> > > > > > > >
> > > > >> > > > > > > > Lee C
> > > > >> > > > > > > >
> > > > >> > > > > > >
> > > > >> > > > > > >
> > > > >> > > > > > >
> > > > >> > > > > > > --
> > > > >> > > > > > > Sincerely yours
> > > > >> > > > > > > Mikhail Khludnev
> > > > >> > > > > > >
> > > > >> > > > > >
> > > > >> > > > >
> > > > >> > > > >
> > > > >> > > > >
> > > > >> > > > > --
> > > > >> > > > > Sincerely yours
> > > > >> > > > > Mikhail Khludnev
> > > > >> > > > >
> > > > >> > > >
> > > > >> > >
> > > > >> > >
> > > > >> > >
> > > > >> > > --
> > > > >> > > Sincerely yours
> > > > >> > > Mikhail Khludnev
> > > > >> > >
> > > > >> >
> > > > >>
> > > > >>
> > > > >>
> > > > >> --
> > > > >> Sincerely yours
> > > > >> Mikhail Khludnev
> > > > >>
> > > > >
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > > Sincerely yours
> > > Mikhail Khludnev
> > >
> >
>
>
>
> --
> Sincerely yours
> Mikhail Khludnev
>
