Thanks for the explanation. Yes, all my join keys are the same, so I think both should be ok too.
All three of my collections have a lot of records, but for the last collection I'm only extracting a few of the fields (about 5) to be shown. Does this count as having three very large joins?

Regards,
Edwin

On 5 May 2017 at 23:37, Joel Bernstein <joels...@gmail.com> wrote:

> *:* queries will work fine for the innerJoin, which is a merge join that
> never runs out of memory. The hashJoin reads the entire "hashed" query into
> memory though, so there are limitations.
>
> So if you have three very large joins that require *:* then the hashJoin
> approach will be problematic. In that case you could use fetch() around the
> innerJoin to do the third join.
>
> parallel(fetch(innerJoin(search(), search())))
>
> Or if the hashJoin uses the same key as the innerJoin you can do the
> hashJoin in parallel as well and partition the "hashed" search across the
> workers:
>
> parallel(hashJoin(innerJoin(search(), search()), hashed=search()))
>
> In this case the "hashed" search partitionKeys would be the same as the
> innerJoin searches. But the join keys must be the same for this scenario to
> work.
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>
> On Fri, May 5, 2017 at 11:17 AM, Zheng Lin Edwin Yeo <edwinye...@gmail.com>
> wrote:
>
> > I found that using *:* will return the entire resultset, and cause the
> > result from the join query to blow up.
> >
> > For example, if there are 2 results in collection1 and 3 results in
> > collection2, I found that there could be 6 results returned by the
> > join query (using hashJoin or innerJoin).
> >
> > Is that correct?
> >
> > Regards,
> > Edwin
> >
> > On 5 May 2017 at 07:17, Zheng Lin Edwin Yeo <edwinye...@gmail.com> wrote:
> >
> > > Hi Joel,
> > >
> > > Yes, the /export works after I remove the /export handler from
> > > solrconfig.xml. Thanks for the advice.
> > >
> > > For *:*, there will be results returned when using /export.
> > > But if one of the queries is *:*, this means the entire resultset will
> > > contains all the records from the query which has *:*?
> > >
> > > Regards,
> > > Edwin
> > >
> > > On 5 May 2017 at 01:46, Joel Bernstein <joels...@gmail.com> wrote:
> > >
> > >> No *:* will simply return all the results from one of the queries. It
> > >> should still join properly. If you are using the /select handler joins
> > >> will not work properly.
> > >>
> > >> This example worked properly for me:
> > >>
> > >> hashJoin(parallel(collection2,
> > >>                   workers=3,
> > >>                   sort="id asc",
> > >>                   innerJoin(search(collection2, q="*:*", fl="id", sort="id asc", qt="/export", partitionKeys="id"),
> > >>                             search(collection2, q="year_i:42", fl="id, year_i", sort="id asc", qt="/export", partitionKeys="id"),
> > >>                             on="id")),
> > >>          hashed=search(collection2, q="day_i:7", fl="id, day_i", sort="id asc", qt="/export"),
> > >>          on="id")
> > >>
> > >> Joel Bernstein
> > >> http://joelsolr.blogspot.com/
> > >>
> > >> On Thu, May 4, 2017 at 12:28 PM, Zheng Lin Edwin Yeo <edwinye...@gmail.com>
> > >> wrote:
> > >>
> > >> > Hi Joel,
> > >> >
> > >> > For the join queries, is it true that if we use q=*:* for the query for
> > >> > one of the join, there will not be any results return?
> > >> >
> > >> > Currently I found this is the case, if I just put q=*:*.
> > >> >
> > >> > Regards,
> > >> > Edwin
> > >> >
> > >> > On 4 May 2017 at 23:38, Zheng Lin Edwin Yeo <edwinye...@gmail.com> wrote:
> > >> >
> > >> > > Hi Joel,
> > >> > >
> > >> > > I think that might be one of the reason.
> > >> > > This is what I have for the /export handler in my solrconfig.xml > > >> > > > > >> > > <requestHandler name="/export" class="solr.SearchHandler"> <lst > > name= > > >> > > "invariants"> <str name="rq">{!xport}</str> <str > > >> name="wt">xsort</str> < > > >> > > str name="distrib">false</str> </lst> <arr name="components"> > > >> > <str>query</ > > >> > > str> </arr> </requestHandler> > > >> > > > > >> > > This is the error message that I get when I use the /export > handler. > > >> > > > > >> > > java.io.IOException: java.util.concurrent.ExecutionException: > > >> > > java.io.IOException: --> http://localhost:8983/solr/ > > >> > > collection1_shard1_replica1/: An exception has occurred on the > > server, > > >> > > refer to server log for details. > > >> > > at org.apache.solr.client.solrj.io.stream.CloudSolrStream. > > >> > > openStreams(CloudSolrStream.java:451) > > >> > > at org.apache.solr.client.solrj.io.stream.CloudSolrStream. > > >> > > open(CloudSolrStream.java:308) > > >> > > at org.apache.solr.client.solrj.io.stream.PushBackStream.open( > > >> > > PushBackStream.java:70) > > >> > > at org.apache.solr.client.solrj.io.stream.JoinStream.open( > > >> > > JoinStream.java:147) > > >> > > at org.apache.solr.client.solrj.io.stream.ExceptionStream. > > >> > > open(ExceptionStream.java:51) > > >> > > at org.apache.solr.handler.StreamHandler$TimerStream. > > >> > > open(StreamHandler.java:457) > > >> > > at org.apache.solr.client.solrj.io.stream.TupleStream. > > >> > > writeMap(TupleStream.java:63) > > >> > > at org.apache.solr.response.JSONWriter.writeMap( > > >> > > JSONResponseWriter.java:547) > > >> > > at org.apache.solr.response.TextResponseWriter.writeVal( > > >> > > TextResponseWriter.java:193) > > >> > > at org.apache.solr.response.JSONWriter. 
> writeNamedListAsMapWithDups( > > >> > > JSONResponseWriter.java:209) > > >> > > at org.apache.solr.response.JSONWriter.writeNamedList( > > >> > > JSONResponseWriter.java:325) > > >> > > at org.apache.solr.response.JSONWriter.writeResponse( > > >> > > JSONResponseWriter.java:120) > > >> > > at org.apache.solr.response.JSONResponseWriter.write( > > >> > > JSONResponseWriter.java:71) > > >> > > at org.apache.solr.response.QueryResponseWriterUtil.writeQueryR > > >> esponse( > > >> > > QueryResponseWriterUtil.java:65) > > >> > > at org.apache.solr.servlet.HttpSolrCall.writeResponse( > > >> > > HttpSolrCall.java:732) > > >> > > at org.apache.solr.servlet.HttpSolrCall.call( > HttpSolrCall.java:473) > > >> > > at org.apache.solr.servlet.SolrDispatchFilter.doFilter( > > >> > > SolrDispatchFilter.java:345) > > >> > > at org.apache.solr.servlet.SolrDispatchFilter.doFilter( > > >> > > SolrDispatchFilter.java:296) > > >> > > at org.eclipse.jetty.servlet.ServletHandler$CachedChain. > > >> > > doFilter(ServletHandler.java:1691) > > >> > > at org.eclipse.jetty.servlet.ServletHandler.doHandle( > > >> > > ServletHandler.java:582) > > >> > > at org.eclipse.jetty.server.handler.ScopedHandler.handle( > > >> > > ScopedHandler.java:143) > > >> > > at org.eclipse.jetty.security.SecurityHandler.handle( > > >> > > SecurityHandler.java:548) > > >> > > at org.eclipse.jetty.server.session.SessionHandler. > > >> > > doHandle(SessionHandler.java:226) > > >> > > at org.eclipse.jetty.server.handler.ContextHandler. > > >> > > doHandle(ContextHandler.java:1180) > > >> > > at org.eclipse.jetty.servlet.ServletHandler.doScope( > > >> > > ServletHandler.java:512) > > >> > > at org.eclipse.jetty.server.session.SessionHandler. > > >> > > doScope(SessionHandler.java:185) > > >> > > at org.eclipse.jetty.server.handler.ContextHandler. 
> > >> > > doScope(ContextHandler.java:1112) > > >> > > at org.eclipse.jetty.server.handler.ScopedHandler.handle( > > >> > > ScopedHandler.java:141) > > >> > > at org.eclipse.jetty.server.handler.ContextHandlerCollection. > > handle( > > >> > > ContextHandlerCollection.java:213) > > >> > > at org.eclipse.jetty.server.handler.HandlerCollection. > > >> > > handle(HandlerCollection.java:119) > > >> > > at org.eclipse.jetty.server.handler.HandlerWrapper.handle( > > >> > > HandlerWrapper.java:134) > > >> > > at org.eclipse.jetty.server.Server.handle(Server.java:534) > > >> > > at org.eclipse.jetty.server.HttpChannel.handle( > > HttpChannel.java:320) > > >> > > at org.eclipse.jetty.server.HttpConnection.onFillable( > > >> > > HttpConnection.java:251) > > >> > > at org.eclipse.jetty.io.AbstractConnection$ > ReadCallback.succeeded( > > >> > > AbstractConnection.java:273) > > >> > > at org.eclipse.jetty.io.FillInterest.fillable( > FillInterest.java:95) > > >> > > at org.eclipse.jetty.io.SelectChannelEndPoint$2.run( > > >> > > SelectChannelEndPoint.java:93) > > >> > > at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume. > > >> > > executeProduceConsume(ExecuteProduceConsume.java:303) > > >> > > at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume. > > >> > > produceConsume(ExecuteProduceConsume.java:148) > > >> > > at org.eclipse.jetty.util.thread.strategy. > > ExecuteProduceConsume.run( > > >> > > ExecuteProduceConsume.java:136) > > >> > > at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob( > > >> > > QueuedThreadPool.java:671) > > >> > > at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run( > > >> > > QueuedThreadPool.java:589) > > >> > > at java.lang.Thread.run(Thread.java:745) > > >> > > Caused by: java.util.concurrent.ExecutionException: > > >> java.io.IOException: > > >> > > --> http://localhost:8983/solr/collection1_shard1_replica1/: An > > >> > exception > > >> > > has occurred on the server, refer to server log for details. 
> > >> > > at java.util.concurrent.FutureTask.report(FutureTask.java:122) > > >> > > at java.util.concurrent.FutureTask.get(FutureTask.java:192) > > >> > > at org.apache.solr.client.solrj.io.stream.CloudSolrStream. > > >> > > openStreams(CloudSolrStream.java:445) > > >> > > ... 42 more > > >> > > Caused by: java.io.IOException: --> http://localhost:8983/solr/ > > >> > > collection1_shard1_replica1/: An exception has occurred on the > > server, > > >> > > refer to server log for details. > > >> > > at org.apache.solr.client.solrj.io.stream.SolrStream.read( > > >> > > SolrStream.java:238) > > >> > > at org.apache.solr.client.solrj.io.stream.CloudSolrStream$ > > >> > > TupleWrapper.next(CloudSolrStream.java:541) > > >> > > at org.apache.solr.client.solrj.io.stream.CloudSolrStream$ > > >> > > StreamOpener.call(CloudSolrStream.java:564) > > >> > > at org.apache.solr.client.solrj.io.stream.CloudSolrStream$ > > >> > > StreamOpener.call(CloudSolrStream.java:551) > > >> > > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > > >> > > at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolE > > >> xecutor. > > >> > > lambda$execute$0(ExecutorUtil.java:229) > > >> > > at java.util.concurrent.ThreadPoolExecutor.runWorker( > > >> > > ThreadPoolExecutor.java:1142) > > >> > > at java.util.concurrent.ThreadPoolExecutor$Worker.run( > > >> > > ThreadPoolExecutor.java:617) > > >> > > ... 1 more > > >> > > Caused by: org.noggit.JSONParser$ParseException: JSON Parse > Error: > > >> > > char=<,position=0 BEFORE='<' AFTER='?xml version="1.0" > > >> > encoding="UTF-8"?> <' > > >> > > at org.noggit.JSONParser.err(JSONParser.java:356) > > >> > > at org.noggit.JSONParser.handleNonDoubleQuoteString(JSONParser. > > >> java:712) > > >> > > at org.noggit.JSONParser.next(JSONParser.java:886) > > >> > > at org.noggit.JSONParser.nextEvent(JSONParser.java:930) > > >> > > at org.apache.solr.client.solrj.io.stream.JSONTupleStream. 
> > >> > > expect(JSONTupleStream.java:97) > > >> > > at org.apache.solr.client.solrj.io.stream.JSONTupleStream. > > >> > > advanceToDocs(JSONTupleStream.java:179) > > >> > > at org.apache.solr.client.solrj.io.stream.JSONTupleStream. > > >> > > next(JSONTupleStream.java:77) > > >> > > at org.apache.solr.client.solrj.io.stream.SolrStream.read( > > >> > > SolrStream.java:207) > > >> > > ... 8 more > > >> > > > > >> > > > > >> > > Regards, > > >> > > Edwin > > >> > > > > >> > > > > >> > > On 4 May 2017 at 22:54, Joel Bernstein <joels...@gmail.com> > wrote: > > >> > > > > >> > >> I suspect that there is something not quite right about the how > the > > >> > >> /export > > >> > >> handler is configured. Straight out of the box in solr 6.4.2 > > /export > > >> > will > > >> > >> be automatically configured. Are you using a Solr instance that > has > > >> been > > >> > >> upgraded in the past and doesn't have standard 6.4.2 configs? > > >> > >> > > >> > >> To really do joins properly you'll have to use the /export > handler > > >> > because > > >> > >> /select will not stream entire result sets (unless they are > pretty > > >> > small). > > >> > >> So your results will be missing data possibly. > > >> > >> > > >> > >> I would take a close look at the logs and see what all the > > exceptions > > >> > are > > >> > >> when you run the a search using qt=/export. If you can post all > the > > >> > stack > > >> > >> traces that get generated when you run the search we'll probably > be > > >> able > > >> > >> to > > >> > >> spot the issue. > > >> > >> > > >> > >> About the field ordering. There is support for field ordering in > > the > > >> > >> Streaming classes but only a few places actually enforce the > order. > > >> The > > >> > >> 6.5 > > >> > >> SQL interface does keep the fields in order as does the new Tuple > > >> > >> expression in Solr 6.6. But the expressions you are working with > > >> > currently > > >> > >> don't enforce field ordering. 
> > >> > >> > > >> > >> > > >> > >> > > >> > >> > > >> > >> Joel Bernstein > > >> > >> http://joelsolr.blogspot.com/ > > >> > >> > > >> > >> On Thu, May 4, 2017 at 2:41 AM, Zheng Lin Edwin Yeo < > > >> > edwinye...@gmail.com > > >> > >> > > > >> > >> wrote: > > >> > >> > > >> > >> > Hi Joel, > > >> > >> > > > >> > >> > I have managed to get the Join to work, but so far it is only > > >> working > > >> > >> when > > >> > >> > I use qt="/select". It is not working when I use qt="/export". > > >> > >> > > > >> > >> > For the display of the field, is there a way to allow it to > list > > >> them > > >> > in > > >> > >> > the order which I want? > > >> > >> > Currently, the display is quite random, and I can get a field > in > > >> > >> > collection1, followed by a field in collection3, then > collection1 > > >> > again, > > >> > >> > and then collection2. > > >> > >> > > > >> > >> > It will be good if we can arrange the field to display in the > > order > > >> > >> that we > > >> > >> > want. > > >> > >> > > > >> > >> > Regards, > > >> > >> > Edwin > > >> > >> > > > >> > >> > > > >> > >> > > > >> > >> > On 4 May 2017 at 09:56, Zheng Lin Edwin Yeo < > > edwinye...@gmail.com> > > >> > >> wrote: > > >> > >> > > > >> > >> > > Hi Joel, > > >> > >> > > > > >> > >> > > It works when I started off with just one expression. > > >> > >> > > > > >> > >> > > Could it be that the data size is too big for export after > the > > >> join, > > >> > >> > which > > >> > >> > > causes the error? > > >> > >> > > > > >> > >> > > Regards, > > >> > >> > > Edwin > > >> > >> > > > > >> > >> > > On 4 May 2017 at 02:53, Joel Bernstein <joels...@gmail.com> > > >> wrote: > > >> > >> > > > > >> > >> > >> I was just testing with the query below and it worked for > me. > > >> Some > > >> > of > > >> > >> > the > > >> > >> > >> error messages I was getting with the syntax was not what I > > was > > >> > >> > expecting > > >> > >> > >> though, so I'll look into the error handling. 
But the joins > do > > >> work > > >> > >> when > > >> > >> > >> the syntax correct. The query below is joining to the same > > >> > collection > > >> > >> > >> three > > >> > >> > >> times, but the mechanics are exactly the same joining three > > >> > different > > >> > >> > >> tables. In this example each join narrows down the result > set. > > >> > >> > >> > > >> > >> > >> hashJoin(parallel(collection2, > > >> > >> > >> workers=3, > > >> > >> > >> sort="id asc", > > >> > >> > >> innerJoin(search(collection2, > > >> q="*:*", > > >> > >> > >> fl="id", > > >> > >> > >> sort="id asc", qt="/export", partitionKeys="id"), > > >> > >> > >> > > search(collection2, > > >> > >> > >> q="year_i:42", fl="id, year_i", sort="id asc", qt="/export", > > >> > >> > >> partitionKeys="id"), > > >> > >> > >> on="id")), > > >> > >> > >> hashed=search(collection2, q="day_i:7", > > fl="id, > > >> > >> day_i", > > >> > >> > >> sort="id asc", qt="/export"), > > >> > >> > >> on="id") > > >> > >> > >> > > >> > >> > >> Joel Bernstein > > >> > >> > >> http://joelsolr.blogspot.com/ > > >> > >> > >> > > >> > >> > >> On Wed, May 3, 2017 at 1:29 PM, Joel Bernstein < > > >> joels...@gmail.com > > >> > > > > >> > >> > >> wrote: > > >> > >> > >> > > >> > >> > >> > Start off with just this expression: > > >> > >> > >> > > > >> > >> > >> > search(collection2, > > >> > >> > >> > q=*:*, > > >> > >> > >> > fl="a_s,b_s,c_s,d_s,e_s", > > >> > >> > >> > sort="a_s asc", > > >> > >> > >> > qt="/export") > > >> > >> > >> > > > >> > >> > >> > And then check the logs for exceptions. 
> > >> > >> > >> > > > >> > >> > >> > Joel Bernstein > > >> > >> > >> > http://joelsolr.blogspot.com/ > > >> > >> > >> > > > >> > >> > >> > On Wed, May 3, 2017 at 12:35 PM, Zheng Lin Edwin Yeo < > > >> > >> > >> edwinye...@gmail.com > > >> > >> > >> > > wrote: > > >> > >> > >> > > > >> > >> > >> >> Hi Joel, > > >> > >> > >> >> > > >> > >> > >> >> I am getting this error after I change add qt=/export and > > >> > removed > > >> > >> the > > >> > >> > >> rows > > >> > >> > >> >> param. Do you know what could be the reason? > > >> > >> > >> >> > > >> > >> > >> >> { > > >> > >> > >> >> "error":{ > > >> > >> > >> >> "metadata":[ > > >> > >> > >> >> "error-class","org.apache. > > solr.common.SolrException", > > >> > >> > >> >> "root-error-class","org.apache.http. > > >> > MalformedChunkCodingExc > > >> > >> e > > >> > >> > >> >> ption"], > > >> > >> > >> >> "msg":"org.apache.http. > MalformedChunkCodingException: > > >> CRLF > > >> > >> > >> expected > > >> > >> > >> >> at > > >> > >> > >> >> end of chunk", > > >> > >> > >> >> "trace":"org.apache.solr.common.SolrException: > > >> > >> > >> >> org.apache.http.MalformedChunkCodingException: CRLF > > >> expected at > > >> > >> end > > >> > >> > of > > >> > >> > >> >> chunk\r\n\tat > > >> > >> > >> >> org.apache.solr.client.solrj. > io.stream.TupleStream.lambda$ > > wr > > >> > >> > >> >> iteMap$0(TupleStream.java:79)\r\n\tat > > >> > >> > >> >> org.apache.solr.response.JSONWriter.writeIterator( > > JSONRespon > > >> > >> > >> >> seWriter.java:523)\r\n\tat > > >> > >> > >> >> org.apache.solr.response.TextResponseWriter.writeVal( > > TextRes > > >> > >> > >> >> ponseWriter.java:175)\r\n\tat > > >> > >> > >> >> org.apache.solr.response.JSONWriter$2.put( > > JSONResponseWriter > > >> > >> > >> >> .java:559)\r\n\tat > > >> > >> > >> >> org.apache.solr.client.solrj.io.stream.TupleStream. 
> > writeMap( > > >> > >> > >> >> TupleStream.java:64)\r\n\tat > > >> > >> > >> >> org.apache.solr.response.JSONWriter.writeMap( > > JSONResponseWri > > >> > >> > >> >> ter.java:547)\r\n\tat > > >> > >> > >> >> org.apache.solr.response.TextResponseWriter.writeVal( > > TextRes > > >> > >> > >> >> ponseWriter.java:193)\r\n\tat > > >> > >> > >> >> org.apache.solr.response.JSONWriter. > > writeNamedListAsMapWithD > > >> > >> > >> >> ups(JSONResponseWriter.java:209)\r\n\tat > > >> > >> > >> >> org.apache.solr.response.JSONWriter.writeNamedList( > > JSONRespo > > >> > >> > >> >> nseWriter.java:325)\r\n\tat > > >> > >> > >> >> org.apache.solr.response.JSONWriter.writeResponse( > > JSONRespon > > >> > >> > >> >> seWriter.java:120)\r\n\tat > > >> > >> > >> >> org.apache.solr.response.JSONResponseWriter.write( > > JSONRespon > > >> > >> > >> >> seWriter.java:71)\r\n\tat > > >> > >> > >> >> org.apache.solr.response.QueryResponseWriterUtil. > > writeQueryR > > >> > >> > >> >> esponse(QueryResponseWriterUtil.java:65)\r\n\tat > > >> > >> > >> >> org.apache.solr.servlet.HttpSolrCall.writeResponse( > > HttpSolrC > > >> > >> > >> >> all.java:732)\r\n\tat > > >> > >> > >> >> org.apache.solr.servlet.HttpSolrCall.call( > > HttpSolrCall.java: > > >> > >> > >> 473)\r\n\tat > > >> > >> > >> >> org.apache.solr.servlet.SolrDispatchFilter.doFilter( > > SolrDisp > > >> > >> > >> >> atchFilter.java:345)\r\n\tat > > >> > >> > >> >> org.apache.solr.servlet.SolrDispatchFilter.doFilter( > > SolrDisp > > >> > >> > >> >> atchFilter.java:296)\r\n\tat > > >> > >> > >> >> org.eclipse.jetty.servlet.ServletHandler$CachedChain. 
> > doFilte > > >> > >> > >> >> r(ServletHandler.java:1691)\r\n\tat > > >> > >> > >> >> org.eclipse.jetty.servlet.ServletHandler.doHandle( > > ServletHan > > >> > >> > >> >> dler.java:582)\r\n\tat > > >> > >> > >> >> org.eclipse.jetty.server.handler.ScopedHandler.handle( > > Scoped > > >> > >> > >> >> Handler.java:143)\r\n\tat > > >> > >> > >> >> org.eclipse.jetty.security.SecurityHandler.handle( > > SecurityHa > > >> > >> > >> >> ndler.java:548)\r\n\tat > > >> > >> > >> >> org.eclipse.jetty.server.session.SessionHandler. > doHandle( > > >> > >> > >> >> SessionHandler.java:226)\r\n\tat > > >> > >> > >> >> org.eclipse.jetty.server.handler.ContextHandler. > doHandle( > > >> > >> > >> >> ContextHandler.java:1180)\r\n\tat > > >> > >> > >> >> org.eclipse.jetty.servlet.ServletHandler.doScope( > > ServletHand > > >> > >> > >> >> ler.java:512)\r\n\tat > > >> > >> > >> >> org.eclipse.jetty.server.session.SessionHandler.doScope( > > >> > >> > >> >> SessionHandler.java:185)\r\n\tat > > >> > >> > >> >> org.eclipse.jetty.server.handler.ContextHandler.doScope( > > >> > >> > >> >> ContextHandler.java:1112)\r\n\tat > > >> > >> > >> >> org.eclipse.jetty.server.handler.ScopedHandler.handle( > > Scoped > > >> > >> > >> >> Handler.java:141)\r\n\tat > > >> > >> > >> >> org.eclipse.jetty.server.handler. > > ContextHandlerCollection.ha > > >> > >> > >> >> ndle(ContextHandlerCollection.java:213)\r\n\tat > > >> > >> > >> >> org.eclipse.jetty.server.handler.HandlerCollection. > handle( > > >> > >> > >> >> HandlerCollection.java:119)\r\n\tat > > >> > >> > >> >> org.eclipse.jetty.server.handler.HandlerWrapper.handle( > > Handl > > >> > >> > >> >> erWrapper.java:134)\r\n\tat > > >> > >> > >> >> org.eclipse.jetty.server.Server.handle(Server.java:534) > > \r\n\ > > >> tat > > >> > >> > >> >> org.eclipse.jetty.server.HttpChannel.handle(HttpChannel. 
> > >> > >> > >> java:320)\r\n\tat > > >> > >> > >> >> org.eclipse.jetty.server.HttpConnection.onFillable( > > HttpConne > > >> > >> > >> >> ction.java:251)\r\n\tat > > >> > >> > >> >> org.eclipse.jetty.io.AbstractConnection$ReadCallback. > > >> > >> > >> >> succeeded(AbstractConnection.java:273)\r\n\tat > > >> > >> > >> >> org.eclipse.jetty.io.FillInterest.fillable(FillInterest. > > >> > >> > >> java:95)\r\n\tat > > >> > >> > >> >> org.eclipse.jetty.io.SelectChannelEndPoint$2.run( > > SelectChann > > >> > >> > >> >> elEndPoint.java:93)\r\n\tat > > >> > >> > >> >> org.eclipse.jetty.util.thread. > > strategy.ExecuteProduceConsume > > >> > >> > >> >> .executeProduceConsume(ExecuteProduceConsume.java: > > 303)\r\n\ > > >> tat > > >> > >> > >> >> org.eclipse.jetty.util.thread. > > strategy.ExecuteProduceConsume > > >> > >> > >> >> .produceConsume(ExecuteProduceConsume.java:148)\r\n\tat > > >> > >> > >> >> org.eclipse.jetty.util.thread. > > strategy.ExecuteProduceConsume > > >> > >> > >> >> .run(ExecuteProduceConsume.java:136)\r\n\tat > > >> > >> > >> >> org.eclipse.jetty.util.thread. > > QueuedThreadPool.runJob(Queued > > >> > >> > >> >> ThreadPool.java:671)\r\n\tat > > >> > >> > >> >> org.eclipse.jetty.util.thread. > > QueuedThreadPool$2.run(QueuedT > > >> > >> > >> >> hreadPool.java:589)\r\n\tat > > >> > >> > >> >> java.lang.Thread.run(Thread.java:745)\r\nCaused by: > > >> > >> > >> >> org.apache.http.MalformedChunkCodingException: CRLF > > >> expected at > > >> > >> end > > >> > >> > of > > >> > >> > >> >> chunk\r\n\tat > > >> > >> > >> >> org.apache.http.impl.io.ChunkedInputStream. 
> > getChunkSize(Chun > > >> > >> > >> >> kedInputStream.java:255)\r\n\tat > > >> > >> > >> >> org.apache.http.impl.io.ChunkedInputStream.nextChunk( > > Chunked > > >> > >> > >> >> InputStream.java:227)\r\n\tat > > >> > >> > >> >> org.apache.http.impl.io.ChunkedInputStream.read( > > ChunkedInput > > >> > >> > >> >> Stream.java:186)\r\n\tat > > >> > >> > >> >> org.apache.http.impl.io.ChunkedInputStream.read( > > ChunkedInput > > >> > >> > >> >> Stream.java:215)\r\n\tat > > >> > >> > >> >> org.apache.http.impl.io.ChunkedInputStream.close( > > ChunkedInpu > > >> > >> > >> >> tStream.java:316)\r\n\tat > > >> > >> > >> >> org.apache.http.conn.BasicManagedEntity. > > streamClosed(BasicMa > > >> > >> > >> >> nagedEntity.java:164)\r\n\tat > > >> > >> > >> >> org.apache.http.conn.EofSensorInputStream. > > checkClose(EofSens > > >> > >> > >> >> orInputStream.java:228)\r\n\tat > > >> > >> > >> >> org.apache.http.conn.EofSensorInputStream.close( > > EofSensorInp > > >> > >> > >> >> utStream.java:174)\r\n\tat > > >> > >> > >> >> sun.nio.cs.StreamDecoder.implClose(StreamDecoder.java: > > 378)\ > > >> > >> r\n\tat > > >> > >> > >> >> sun.nio.cs.StreamDecoder.close(StreamDecoder.java:193)\ > > r\n\ > > >> tat > > >> > >> > >> >> java.io.InputStreamReader.close(InputStreamReader.java: > > 199)\ > > >> > >> r\n\tat > > >> > >> > >> >> org.apache.solr.client.solrj.io.stream.JSONTupleStream. > > close > > >> > >> > >> >> (JSONTupleStream.java:92)\r\n\tat > > >> > >> > >> >> org.apache.solr.client.solrj.io.stream.SolrStream.close( > > Solr > > >> > >> > >> >> Stream.java:193)\r\n\tat > > >> > >> > >> >> org.apache.solr.client.solrj.io.stream.CloudSolrStream. > > close > > >> > >> > >> >> (CloudSolrStream.java:464)\r\n\tat > > >> > >> > >> >> org.apache.solr.client.solrj.io.stream.HashJoinStream. > > close( > > >> > >> > >> >> HashJoinStream.java:231)\r\n\tat > > >> > >> > >> >> org.apache.solr.client.solrj.io.stream.ExceptionStream. 
> > close > > >> > >> > >> >> (ExceptionStream.java:93)\r\n\tat > > >> > >> > >> >> org.apache.solr.handler.StreamHandler$TimerStream.close( > > >> > >> > >> >> StreamHandler.java:452)\r\n\tat > > >> > >> > >> >> org.apache.solr.client.solrj. > io.stream.TupleStream.lambda$ > > wr > > >> > >> > >> >> iteMap$0(TupleStream.java:71)\r\n\t... > > >> > >> > >> >> 40 more\r\n", > > >> > >> > >> >> "code":500}} > > >> > >> > >> >> > > >> > >> > >> >> > > >> > >> > >> >> Regards, > > >> > >> > >> >> Edwin > > >> > >> > >> >> > > >> > >> > >> >> > > >> > >> > >> >> On 4 May 2017 at 00:00, Joel Bernstein < > joels...@gmail.com > > > > > >> > >> wrote: > > >> > >> > >> >> > > >> > >> > >> >> > I've reformatted the expression below and made a few > > >> changes. > > >> > >> You > > >> > >> > >> have > > >> > >> > >> >> put > > >> > >> > >> >> > things together properly. But these are MapReduce joins > > >> that > > >> > >> > require > > >> > >> > >> >> > exporting the entire result sets. So you will need to > add > > >> > >> > qt=/export > > >> > >> > >> to > > >> > >> > >> >> all > > >> > >> > >> >> > the searches and remove the rows param. In Solr 6.6. > > there > > >> is > > >> > a > > >> > >> new > > >> > >> > >> >> > "shuffle" expression that does this automatically. > > >> > >> > >> >> > > > >> > >> > >> >> > To test things you'll want to break down each > expression > > >> and > > >> > >> make > > >> > >> > >> sure > > >> > >> > >> >> it's > > >> > >> > >> >> > behaving as expected. > > >> > >> > >> >> > > > >> > >> > >> >> > For example first run each search. Then run the > > innerJoin, > > >> not > > >> > >> in > > >> > >> > >> >> parallel > > >> > >> > >> >> > mode. Then run it in parallel mode. Then try the whole > > >> thing. 
> > >> > >> > >> >> >
> > >> > >> > >> >> > hashJoin(parallel(collection2,
> > >> > >> > >> >> >                   innerJoin(search(collection2,
> > >> > >> > >> >> >                                    q=*:*,
> > >> > >> > >> >> >                                    fl="a_s,b_s,c_s,d_s,e_s",
> > >> > >> > >> >> >                                    sort="a_s asc",
> > >> > >> > >> >> >                                    partitionKeys="a_s",
> > >> > >> > >> >> >                                    qt="/export"),
> > >> > >> > >> >> >                             search(collection1,
> > >> > >> > >> >> >                                    q=*:*,
> > >> > >> > >> >> >                                    fl="a_s,f_s,g_s,h_s,i_s,j_s",
> > >> > >> > >> >> >                                    sort="a_s asc",
> > >> > >> > >> >> >                                    partitionKeys="a_s",
> > >> > >> > >> >> >                                    qt="/export"),
> > >> > >> > >> >> >                             on="a_s"),
> > >> > >> > >> >> >                   workers="2",
> > >> > >> > >> >> >                   sort="a_s asc"),
> > >> > >> > >> >> >          hashed=search(collection3,
> > >> > >> > >> >> >                        q=*:*,
> > >> > >> > >> >> >                        fl="a_s,k_s,l_s",
> > >> > >> > >> >> >                        sort="a_s asc",
> > >> > >> > >> >> >                        qt="/export"),
> > >> > >> > >> >> >          on="a_s")
> > >> > >> > >> >> >
> > >> > >> > >> >> > Joel Bernstein
> > >> > >> > >> >> > http://joelsolr.blogspot.com/
> > >> > >> > >> >> >
> > >> > >> > >> >> > On Wed, May 3, 2017 at 11:26 AM, Zheng Lin Edwin Yeo <edwinye...@gmail.com>
> > >> > >> > >> >> > wrote:
> > >> > >> > >> >> >
> > >> > >> > >> >> > > Hi Joel,
> > >> > >> > >> >> > >
> > >> > >> > >> >> > > Thanks for the clarification.
> > >> > >> > >> >> > >
> > >> > >> > >> >> > > Would like to check, is this the correct way to do the join?
> > >> > >> > >> >> > > Currently, I could not get any results after putting in the hashJoin
> > >> > >> > >> >> > > for the 3rd, smallerStream collection (collection3).
> > >> > >> > >> >> > >
> > >> > >> > >> >> > > http://localhost:8983/solr/collection1/stream?expr=
> > >> > >> > >> >> > > hashJoin(parallel(collection2,
> > >> > >> > >> >> > > innerJoin(
> > >> > >> > >> >> > > search(collection2,
> > >> > >> > >> >> > > q=*:*,
> > >> > >> > >> >> > > fl="a_s,b_s,c_s,d_s,e_s",
> > >> > >> > >> >> > > sort="a_s asc",
> > >> > >> > >> >> > > partitionKeys="a_s",
> > >> > >> > >> >> > > rows=200),
> > >> > >> > >> >> > > search(collection1,
> > >> > >> > >> >> > > q=*:*,
> > >> > >> > >> >> > > fl="a_s,f_s,g_s,h_s,i_s,j_s",
> > >> > >> > >> >> > > sort="a_s asc",
> > >> > >> > >> >> > > partitionKeys="a_s",
> > >> > >> > >> >> > > rows=200),
> > >> > >> > >> >> > > on="a_s"),
> > >> > >> > >> >> > > workers="2",
> > >> > >> > >> >> > > sort="a_s asc"),
> > >> > >> > >> >> > > hashed=search(collection3,
> > >> > >> > >> >> > > q=*:*,
> > >> > >> > >> >> > > fl="a_s,k_s,l_s",
> > >> > >> > >> >> > > sort="a_s asc",
> > >> > >> > >> >> > > rows=200),
> > >> > >> > >> >> > > on="a_s")
> > >> > >> > >> >> > > &indent=true
> > >> > >> > >> >> > >
> > >> > >> > >> >> > > Regards,
> > >> > >> > >> >> > > Edwin
> > >> > >> > >> >> > >
> > >> > >> > >> >> > > On 3 May 2017 at 20:59, Joel Bernstein <joels...@gmail.com> wrote:
> > >> > >> > >> >> > >
> > >> > >> > >> >> > > > Sorry, it's just called hashJoin
> > >> > >> > >> >> > > >
> > >> > >> > >> >> > > > Joel Bernstein
> > >> > >> > >> >> > > > http://joelsolr.blogspot.com/
> > >> > >> > >> >> > > >
> > >> > >> > >> >> > > > On Wed, May 3, 2017 at 2:45 AM, Zheng Lin Edwin Yeo <edwinye...@gmail.com>
> > >> > >> > >> >> > > > wrote:
> > >> > >> > >> >> > > >
> > >> > >> > >> >> > > > > Hi Joel,
> > >> > >> > >> >> > > > >
> > >> > >> > >> >> > > > > I am getting this error when I used the innerHashJoin.
> > >> > >> > >> >> > > > > > > >> > >> > >> >> > > > > "EXCEPTION":"Invalid stream expression > > >> > >> > innerHashJoin(parallel( > > >> > >> > >> >> > > innerJoin > > >> > >> > >> >> > > > > > > >> > >> > >> >> > > > > I also can't find the documentation on > > innerHashJoin > > >> for > > >> > >> the > > >> > >> > >> >> > Streaming > > >> > >> > >> >> > > > > Expressions. > > >> > >> > >> >> > > > > > > >> > >> > >> >> > > > > Are you referring to hashJoin? > > >> > >> > >> >> > > > > > > >> > >> > >> >> > > > > Regards, > > >> > >> > >> >> > > > > Edwin > > >> > >> > >> >> > > > > > > >> > >> > >> >> > > > > > > >> > >> > >> >> > > > > On 3 May 2017 at 13:20, Zheng Lin Edwin Yeo < > > >> > >> > >> edwinye...@gmail.com > > >> > >> > >> >> > > > >> > >> > >> >> > > > wrote: > > >> > >> > >> >> > > > > > > >> > >> > >> >> > > > > > Hi Joel, > > >> > >> > >> >> > > > > > > > >> > >> > >> >> > > > > > Thanks for the info. > > >> > >> > >> >> > > > > > > > >> > >> > >> >> > > > > > Regards, > > >> > >> > >> >> > > > > > Edwin > > >> > >> > >> >> > > > > > > > >> > >> > >> >> > > > > > > > >> > >> > >> >> > > > > > On 3 May 2017 at 02:04, Joel Bernstein < > > >> > >> joels...@gmail.com > > >> > >> > > > > >> > >> > >> >> wrote: > > >> > >> > >> >> > > > > > > > >> > >> > >> >> > > > > >> Also take a look at the documentation for the > > >> "fetch" > > >> > >> > >> streaming > > >> > >> > >> >> > > > > >> expression. > > >> > >> > >> >> > > > > >> > > >> > >> > >> >> > > > > >> Joel Bernstein > > >> > >> > >> >> > > > > >> http://joelsolr.blogspot.com/ > > >> > >> > >> >> > > > > >> > > >> > >> > >> >> > > > > >> On Tue, May 2, 2017 at 2:03 PM, Joel > Bernstein < > > >> > >> > >> >> > joels...@gmail.com> > > >> > >> > >> >> > > > > >> wrote: > > >> > >> > >> >> > > > > >> > > >> > >> > >> >> > > > > >> > Yes you join more then one collection with > > >> > Streaming > > >> > >> > >> >> > Expressions. 
Here are a few things to keep in mind.

* You'll likely want to use the parallel function around the largest join. You'll need to use the join keys as the partitionKeys.
* innerJoin: requires that the streams be sorted on the join keys.
* innerHashJoin: has no sorting requirement.

So a strategy for a three collection join might look like this:

innerHashJoin(parallel(innerJoin(bigStream, bigStream)), smallerStream)

The largest join can be done in parallel using an innerJoin. You can then wrap the stream coming out of the parallel function in an innerHashJoin to join it to another stream.
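
As a rough illustration of the strategy above, the following Python sketch assembles the nested expression as a string. It uses hashJoin, the name Joel confirms in his follow-up, in place of innerHashJoin. The helper names (call, quoted, export_search, three_way_join), the trimmed field lists, and the use of qt="/export" are assumptions made for illustration; the collection names and the a_s join key are borrowed from the request quoted in this thread.

```python
def call(name, *args, **params):
    """Render name(arg1, arg2, key=value, ...) as a streaming-expression string."""
    rendered = list(args) + [f"{k}={v}" for k, v in params.items()]
    return f"{name}({', '.join(rendered)})"


def quoted(value):
    """Wrap a parameter value in double quotes."""
    return f'"{value}"'


def export_search(collection, fields, key, partitioned=False):
    """Hypothetical helper: a search() source sorted on the join key.

    qt="/export" is an assumption here; partitionKeys is added only for
    streams that run inside parallel().
    """
    params = {
        "q": "*:*",
        "fl": quoted(fields),
        "sort": quoted(f"{key} asc"),
        "qt": quoted("/export"),
    }
    if partitioned:
        params["partitionKeys"] = quoted(key)
    return call("search", collection, **params)


def three_way_join(key="a_s", workers=2):
    """Build hashJoin(parallel(innerJoin(big, big)), hashed=smaller)."""
    # The largest join: both inputs sorted and partitioned on the join key.
    big_join = call(
        "innerJoin",
        export_search("collection2", "a_s,b_s", key, partitioned=True),
        export_search("collection1", "a_s,f_s", key, partitioned=True),
        on=quoted(key),
    )
    # Run the large merge join across workers.
    parallel_join = call(
        "parallel",
        "collection2",
        big_join,
        workers=quoted(workers),
        sort=quoted(f"{key} asc"),
    )
    # The smaller stream is read into memory by hashJoin.
    smaller = export_search("collection3", "a_s,k_s", key)
    return call("hashJoin", parallel_join, hashed=smaller, on=quoted(key))


expression = three_way_join()
print(expression)
```

Only the two streams inside innerJoin carry partitionKeys, since they are the ones split across workers; the hashed stream is left unpartitioned because hashJoin wraps the output of parallel rather than running inside it.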

Joel Bernstein
http://joelsolr.blogspot.com/

On Mon, May 1, 2017 at 9:42 PM, Zheng Lin Edwin Yeo <edwinye...@gmail.com> wrote:

Hi,

Is it possible to join more than 2 collections using one of the streaming expressions (e.g. innerJoin)? If not, are there other ways we can do it?

Currently, I may need to join 3 or 4 collections together and output selected fields from all of them together.

I'm using Solr 6.4.2.

Regards,
Edwin
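
The request quoted in this thread places the expression directly after expr= in the /stream URL. A minimal sketch of building that URL with the expression percent-encoded is shown below. The function name stream_url and the shortened example expression are assumptions for illustration; the host, the /stream path, and the expr and indent parameters are taken from the quoted request.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def stream_url(base_url, collection, expression):
    """Hypothetical helper: build a /stream URL with an encoded expr parameter."""
    query = urlencode({"expr": expression, "indent": "true"})
    return f"{base_url}/{collection}/stream?{query}"


# Shortened example expression, not the full three-collection join.
expr = 'search(collection1, q=*:*, fl="a_s,f_s", sort="a_s asc", rows=200)'
url = stream_url("http://localhost:8983/solr", "collection1", expr)
print(url)

# Decoding the query string gives back the original expression.
decoded = parse_qs(urlsplit(url).query)["expr"][0]
print(decoded == expr)
```

Encoding the expression this way keeps characters such as quotes, commas, and spaces from being misread when the URL is sent from a script rather than pasted into a browser.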