Hello Developers,

I just want to ask: don't you think that response streaming can be useful for
things like OLAP? E.g. if you have a sharded index, presorted and pre-joined
the BJQ way, you can calculate counts in many cube cells in parallel.

The essential distributed test for response streaming just passed:
https://github.com/m-khl/solr-patches/blob/ec4db7c0422a5515392a7019c5bd23ad3f546e4b/solr/core/src/test/org/apache/solr/response/RespStreamDistributedTest.java
The branch is https://github.com/m-khl/solr-patches/tree/streaming

Regards

On Mon, Apr 2, 2012 at 10:55 AM, Mikhail Khludnev <mkhlud...@griddynamics.com> wrote:

> Hello,
>
> Small update: reading the streamed response is done via a callback. No
> SolrDocumentList is held in memory.
> https://github.com/m-khl/solr-patches/tree/streaming
> Here is the test:
> https://github.com/m-khl/solr-patches/blob/d028d4fabe0c20cb23f16098637e2961e9e2366e/solr/core/src/test/org/apache/solr/response/ResponseStreamingTest.java#L138
>
> No progress on distributed search via streaming yet.
>
> Please let me know if you don't want to have updates from my playground.
>
> Regards
>
> On Thu, Mar 29, 2012 at 1:02 PM, Mikhail Khludnev <mkhlud...@griddynamics.com> wrote:
>
>> @All
>> Why does nobody want such a pretty cool feature?
>>
>> Nicholas,
>> I have made a little progress: I'm able to stream in the javabin codec
>> format while searching. It implies sorting by _docid_.
>>
>> Here is the diff:
>> https://github.com/m-khl/solr-patches/commit/2f9ff068c379b3008bb983d0df69dff714ddde95
>>
>> The current issue is that SolrJ reads the response as a whole.
>> Reading by callback is supported by EmbeddedServer only. Anyway, it should
>> not be a big deal. ResponseStreamingTest.java somehow works.
>> I'm stuck on introducing response streaming in distributed search; it's
>> actually more challenging. RespStreamDistributedTest fails.
>>
>> Regards
>>
>> On Fri, Mar 16, 2012 at 3:51 PM, Nicholas Ball <nicholas.b...@nodelay.com> wrote:
>>
>>> Mikhail & Ludovic,
>>>
>>> Thanks for both your replies, very helpful indeed!
>>>
>>> Ludovic, I was actually looking into just that and did some tests with
>>> SolrJ. It does work well, but it needs some changes on the Solr server if
>>> we want to send out individual documents at various times. This could be
>>> done with a write() and flush() to the FastOutputStream (daos) in
>>> JavaBinCodec.
>>> I therefore think that a combination of this and Mikhail's solution
>>> would work best!
>>>
>>> Mikhail, you mention that your solution doesn't currently work and that
>>> you are not sure why. Could it be that you haven't flushed the data
>>> (os.flush()) you've written in the collect method of DocSetStreamer? I
>>> think placing the output stream into the SolrQueryRequest is the way to
>>> go, so that we can access it and write to it how we intend. However, I
>>> think using the JavaBinCodec would be ideal, so that we can work with
>>> SolrJ directly and not mess around with the encoding of the docs/data
>>> etc...
>>>
>>> At the moment the entry point to JavaBinCodec is through the
>>> BinaryResponseWriter, which calls the highest-level marshal() method,
>>> which encodes and sends out the entire SolrQueryResponse (line 49 @
>>> BinaryResponseWriter). What would be ideal is to be able to break up the
>>> response and call the JavaBinCodec for pieces of it, with a flush after
>>> each call. I did a few tests with a simple Thread.sleep and a flush to
>>> see if this would actually work, and it looks like it's working out
>>> perfectly. Just trying to figure out the best way to actually do it
>>> now :) Any ideas?
>>>
>>> On another note, for a solution to work with chunked transfer encoding
>>> (and therefore web browsers), a lot more development is going to be
>>> needed. Not sure if it's worth trying yet, but I might look into it
>>> later down the line.
>>>
>>> Nick
>>>
>>> On Fri, 16 Mar 2012 07:29:20 +0300, Mikhail Khludnev
>>> <mkhlud...@griddynamics.com> wrote:
>>> > Ludovic,
>>> >
>>> > I looked through. First of all, it seems to me you don't amend the
>>> > regular "servlet" Solr server, but only the embedded one.
>>> > Anyway, the difference is that you stream the DocList via a callback,
>>> > but that means you've instantiated it in memory and keep it there
>>> > until it is completely consumed. Think about a billion numFound.
>>> > The core idea of my approach is to keep almost zero memory for the
>>> > response.
>>> >
>>> > Regards
>>> >
>>> > On Fri, Mar 16, 2012 at 12:12 AM, lboutros <boutr...@gmail.com> wrote:
>>> >
>>> >> Hi,
>>> >>
>>> >> I was looking for something similar.
>>> >>
>>> >> I tried this patch:
>>> >>
>>> >> https://issues.apache.org/jira/browse/SOLR-2112
>>> >>
>>> >> It's working quite well (I've back-ported the code to Solr 3.5.0...).
>>> >>
>>> >> Is it really different from what you are trying to achieve?
>>> >>
>>> >> Ludovic.
>>> >>
>>> >> -----
>>> >> Jouve
>>> >> France.
>>> >> --
>>> >> View this message in context:
>>> >> http://lucene.472066.n3.nabble.com/Responding-to-Requests-with-Chunks-Streaming-tp3827316p3829909.html
>>> >> Sent from the Solr - User mailing list archive at Nabble.com.
>>
>> --
>> Sincerely yours
>> Mikhail Khludnev
>> ge...@yandex.ru
>>
>> <http://www.griddynamics.com>
>> <mkhlud...@griddynamics.com>
>
> --
> Sincerely yours
> Mikhail Khludnev
> ge...@yandex.ru
>
> <http://www.griddynamics.com>
> <mkhlud...@griddynamics.com>

--
Sincerely yours
Mikhail Khludnev
ge...@yandex.ru

<http://www.griddynamics.com>
<mkhlud...@griddynamics.com>
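
For readers following the thread, here is a minimal sketch of the two ideas discussed above: writing and flushing each document as soon as it is produced, and reading the stream through a callback so that no full document list is held in memory. It is only an illustration under stated assumptions. It uses plain java.io streams with a simple length-prefixed framing (DataOutputStream.writeUTF) in place of Solr's JavaBinCodec, and the names StreamingSketch, writeStreaming and readStreaming are hypothetical; they are not part of Solr, SolrJ, or the patches linked in this thread.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Arrays;
import java.util.Iterator;
import java.util.function.Consumer;

// Hypothetical illustration only; not Solr code.
public class StreamingSketch {

    /**
     * Writes each document as soon as the iterator yields it and flushes
     * after every record, so the writer never buffers the whole result set.
     * Returns the number of records written.
     */
    static int writeStreaming(Iterator<String> docs, OutputStream out) throws IOException {
        DataOutputStream data = new DataOutputStream(out);
        int written = 0;
        while (docs.hasNext()) {
            // Assumed framing: writeUTF prefixes each record with its length
            // (records are limited to 65535 encoded bytes). This is NOT JavaBin.
            data.writeUTF(docs.next());
            data.flush(); // push this record downstream before producing the next
            written++;
        }
        return written;
    }

    /**
     * Reads records one at a time and hands each to the callback, so the
     * reader holds at most one document in memory. Returns the record count.
     */
    static int readStreaming(InputStream in, Consumer<String> callback) throws IOException {
        DataInputStream data = new DataInputStream(in);
        int read = 0;
        while (true) {
            String doc;
            try {
                doc = data.readUTF();
            } catch (EOFException endOfStream) {
                return read; // a clean end of stream terminates the loop
            }
            callback.accept(doc);
            read++;
        }
    }

    public static void main(String[] args) throws IOException {
        // An in-memory buffer stands in for the HTTP connection.
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        writeStreaming(Arrays.asList("doc-1", "doc-2").iterator(), wire);
        readStreaming(new ByteArrayInputStream(wire.toByteArray()),
                doc -> System.out.println("got " + doc));
    }
}
```

In a real server the OutputStream would be the servlet response stream, and whether a flush reaches the client immediately depends on the container and any intermediaries, which is where the chunked transfer encoding concern raised above comes in.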