I got it when you said form N queries. I just wanted to try the "get all
cursorMarks first" approach, but realized it would be very inefficient, as
you said, since a cursor mark is a serialized version of the last sorted
value you received, and hence you are still reading the results from Solr
although
You're executing all the queries to parallelize before even starting.
Seems very inefficient. My suggestion doesn't require this first step.
Perhaps it was confusing because I mentioned "your own cursorMark".
Really I meant to bypass that entirely: just form N queries that were
restricted to N
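To make the "form N queries" idea concrete, here is a rough sketch (not
necessarily exactly what Erick has in mind): assuming a hypothetical numeric
field such as timestamp with a known min/max, the result set can be split
into N disjoint filter-query ranges that independent workers page through
separately.

import org.apache.solr.client.solrj.SolrQuery

object RangePartition {
  // Split [min, max] on a hypothetical numeric "timestamp" field into n
  // disjoint fq ranges; each returned query can be paged by its own worker.
  def partition(query: String, min: Long, max: Long, n: Int): Seq[SolrQuery] = {
    val span = (max - min) / n + 1
    (0 until n).map { i =>
      val lo = min + i * span
      val hi = math.min(lo + span - 1, max)
      val q = new SolrQuery(query)
      q.addFilterQuery(s"timestamp:[$lo TO $hi]")
      q
    }
  }
}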
Thanks Joel for the explanation.
Hi Erick,
One of the ways I am trying to parallelize the cursor approach is by
iterating the result set twice.
(1) Once just to get all the cursor marks
import org.apache.solr.client.solrj.SolrQuery

val q: SolrQuery = new SolrQuery()
q.set("q", query)
q.add("fq", query)
q.add("rows", "100000")  // page size; rows=100,000 as in the original question below
q.add("sort", "id asc")  // cursor paging requires a sort ending on the uniqueKey ("id" assumed)
Solr 5 was very early days for Streaming Expressions. Streaming Expressions
and SQL use Java 8, so development switched to the 6.0 branch five months
before the 6.0 release. As a result there was a very large jump in features
and bug fixes from Solr 5 to Solr 6 in Streaming Expressions.
Joel Bernstein
In Solr 5 the /export handler wasn't escaping text fields in its JSON output,
which would produce JSON parse exceptions. This was fixed in Solr 6.0.
Joel Bernstein
http://joelsolr.blogspot.com/
On Tue, Nov 8, 2016 at 6:17 PM, Erick Erickson wrote:
Hmm, that should work fine. Let us know what the logs show if anything
because this is weird.
Best,
Erick
On Tue, Nov 8, 2016 at 1:00 PM, Chetas Joshi wrote:
Hi Erick,
This is how I use the streaming approach.
Here is the solrconfig block.
<requestHandler name="/export" class="solr.SearchHandler">
  <lst name="invariants">
    <str name="rq">{!xport}</str>
    <str name="wt">xsort</str>
    <str name="distrib">false</str>
  </lst>
  <arr name="components">
    <str>query</str>
  </arr>
</requestHandler>
And here is the code in which SolrJ is being used.
String zkHost = args[0];
String collection = args[1];
Map props = new
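(The message is truncated here in the archive. For reference, a minimal
sketch, in Scala against the same SolrJ API, of reading Tuples from /export
using the Solr 5.x Map-based CloudSolrStream constructor; the q/fl/sort
values are assumptions, and /export requires an explicit fl and sort:)

import java.util.{HashMap => JHashMap}
import org.apache.solr.client.solrj.io.stream.CloudSolrStream

object ExportReader {
  def main(args: Array[String]): Unit = {
    val zkHost = args(0)
    val collection = args(1)

    val props = new JHashMap[String, String]()
    props.put("q", "*:*")        // assumed query
    props.put("qt", "/export")   // route the request to the export handler
    props.put("fl", "id")        // assumed field list
    props.put("sort", "id asc")  // assumed sort (export needs docValues on fl/sort fields)

    val stream = new CloudSolrStream(zkHost, collection, props)
    try {
      stream.open()
      var tuple = stream.read()
      while (!tuple.EOF) {       // the EOF tuple marks the end of the stream
        println(tuple.getString("id"))
        tuple = stream.read()
      }
    } finally {
      stream.close()
    }
  }
}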
Hmmm, /export is supposed to handle result sets in the tens of millions. I
know of a situation where the Streaming Aggregation functionality backported
to Solr 4.10 processes at that scale. So do you have any clue
what exactly is failing? Is there anything in the Solr logs?
_How_ are you using /export,
Thanks Yonik for the explanation.
Hi Erick,
I was using the /xport functionality, but it hasn't been stable (Solr
5.5.0). I started running into runtime exceptions (JSON parsing
exceptions) while reading the stream of Tuples. This started happening as
the size of my collection increased 3 times
Have you considered the /xport functionality?
On Fri, Nov 4, 2016 at 5:56 PM, Yonik Seeley wrote:
No, you can't get cursor-marks ahead of time.
They are the serialized representation of the last sort values
encountered (hence not known ahead of time).
-Yonik
On Fri, Nov 4, 2016 at 8:48 PM, Chetas Joshi wrote:
Hi,
I am using the cursor approach to fetch results from Solr (5.5.0). Most of
my queries return millions of results. Is there a way I can read the pages
in parallel? Is there a way I can get all the cursors well in advance?
Let's say my query returns 2M documents and I have set rows=100,000.
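For reference, the standard single-threaded cursorMark loop in SolrJ looks
roughly like this (zkHost, collection, query, and the uniqueKey field "id"
are placeholders):

import org.apache.solr.client.solrj.SolrQuery
import org.apache.solr.client.solrj.impl.CloudSolrClient
import org.apache.solr.common.params.CursorMarkParams

val client = new CloudSolrClient(zkHost)   // Solr 5.x constructor
client.setDefaultCollection(collection)

val q = new SolrQuery(query)
q.setRows(100000)
q.setSort(SolrQuery.SortClause.asc("id"))  // cursors need a sort ending on the uniqueKey

var cursorMark = CursorMarkParams.CURSOR_MARK_START
var done = false
while (!done) {
  q.set(CursorMarkParams.CURSOR_MARK_PARAM, cursorMark)
  val rsp = client.query(q)
  // process rsp.getResults() for this page ...
  val next = rsp.getNextCursorMark
  done = cursorMark == next                // an unchanged mark means the last page was read
  cursorMark = next
}
client.close()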