About reducing the number of Solr shards

2018-05-19 Thread 苗海泉
Hello everyone, I have run into a problem with reducing the number of shards
in Solr. My Solr cluster is deployed in SolrCloud mode and runs Solr 6.0. I
now need to take several of the Solr machines out of the cluster and use them
for other purposes. How should I go about this? Thank you for your help.

--
==
联创科技
知行如一 (knowledge and action as one)
==


Re: Setting up MiniSolrCloudCluster to use pre-built index

2018-05-19 Thread Mark Miller
You create MiniSolrCloudCluster with a base directory and then each Jetty
instance created gets a SolrHome in a subfolder called node{i}. So if
legacyCloud=true you can just preconfigure a core and index under the right
node{i} subfolder. legacyCloud=true should not even exist anymore though,
so the long term way to do this would be to create a collection and then
use the merge API or something to merge your index into the empty
collection.
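
As a rough illustration of the legacyCloud=true approach described above,
here is a minimal Python sketch that stages a pre-built index under a
node{i} subfolder before the cluster is started. Everything beyond the
node{i} naming is an assumption: the helper name stage_prebuilt_core is
hypothetical, the data/index layout and the core.properties keys (name,
collection, shard) follow the usual Solr core-discovery conventions, and
the exact properties your Solr version needs may differ.

```python
import shutil
from pathlib import Path


def stage_prebuilt_core(base_dir: Path, node_index: int, core_name: str,
                        prebuilt_index: Path, collection: str,
                        shard: str) -> Path:
    """Copy a pre-built index into <base_dir>/node{i}/<core_name>/data/index.

    Hypothetical helper: the directory layout and property keys are
    assumptions based on standard Solr core discovery.
    """
    core_dir = base_dir / f"node{node_index}" / core_name
    index_dir = core_dir / "data" / "index"
    index_dir.parent.mkdir(parents=True, exist_ok=True)
    # copytree creates index_dir itself and raises if it already exists,
    # which protects against overwriting an index by accident.
    shutil.copytree(prebuilt_index, index_dir)
    # core.properties marks this directory as a core; the collection and
    # shard values tell the core where to register (assumed keys).
    (core_dir / "core.properties").write_text(
        f"name={core_name}\ncollection={collection}\nshard={shard}\n"
    )
    return core_dir
```

You would call this once per shard with the same base directory that is
later handed to MiniSolrCloudCluster, for example
stage_prebuilt_core(base, 1, "test_shard1_replica1", index_path, "test", "shard1"),
where the core name is only an example.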

 - Mark

On Sat, May 19, 2018 at 5:25 PM Ken Krugler 
wrote:

> Hi all,
>
> Wondering if anyone has experience (this is with Solr 6.6) in setting up
> MiniSolrCloudCluster for unit testing, where we want to use an existing
> index.
>
> Note that this index wasn’t built with SolrCloud, as it’s generated by a
> distributed (Hadoop) workflow.
>
> So there’s no “restore from backup” option, or swapping collection
> aliases, etc.
>
> We can push our configset to Zookeeper and create the collection as per
> other unit tests in Solr, but what’s the right way to set up data dirs for
> the cores such that Solr is running with this existing index (or indexes,
> for our sharded test case)?
>
> Thanks!
>
> — Ken
>
> PS - yes, we’re aware of the routing issue with generating our own shards….
>
> --
> Ken Krugler
> +1 530-210-6378
> http://www.scaleunlimited.com
> Custom big data solutions & training
> Flink, Solr, Hadoop, Cascading & Cassandra
>
> --
- Mark
about.me/markrmiller


Setting up MiniSolrCloudCluster to use pre-built index

2018-05-19 Thread Ken Krugler
Hi all,

Wondering if anyone has experience (this is with Solr 6.6) in setting up 
MiniSolrCloudCluster for unit testing, where we want to use an existing index.

Note that this index wasn’t built with SolrCloud, as it’s generated by a 
distributed (Hadoop) workflow.

So there’s no “restore from backup” option, or swapping collection aliases, etc.

We can push our configset to Zookeeper and create the collection as per other 
unit tests in Solr, but what’s the right way to set up data dirs for the cores 
such that Solr is running with this existing index (or indexes, for our sharded 
test case)?

Thanks!

— Ken

PS - yes, we’re aware of the routing issue with generating our own shards….

--
Ken Krugler
+1 530-210-6378
http://www.scaleunlimited.com
Custom big data solutions & training
Flink, Solr, Hadoop, Cascading & Cassandra



Re: SOLR: Array Key to Value on Result

2018-05-19 Thread Erick Erickson
"The Solr Way" (tm) would be to flatten the data and store multiple
records, i.e.
id      key  lang  lang_display
1_EN    1    EN    india
1_TAM   1    TAM   \u0b87\u0ba8\u0bcd\u0ba4\u0bbf\u0baf\u0bbe

where lang is indexed but possibly not stored, and lang_display is
stored but possibly not indexed (or uses docValues, etc.).

This can bloat the index if carried too far, since the number of
records becomes the cross product of all the different possibilities.
That said, until you get into the multiple 10s of millions it's
usually not a problem.

[subquery] is going to be somewhat expensive. You're right it'll only
execute on the docs returned, so if rows=10 the subqueries will be
performed only for those 10 docs. But that's 10 *
number_of_subquery_fields so could explode a bit.

Note I had to play a bit with the "id" field since it's akin to a
primary key in a database so must be different for each record you
want to have available.
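
To make the flattening concrete, here is a small Python sketch that turns
one record with several language values into one record per language, with
the composite id described above. The function name flatten_by_language and
the input shape (a key plus a dict of language code to display name) are
assumptions for illustration; the field names match the table above.

```python
def flatten_by_language(key, names_by_lang):
    """Return one flat record per language, each with a unique id."""
    return [
        {
            # The id must be unique per record, so combine key and language.
            "id": f"{key}_{lang}",
            "key": key,
            "lang": lang,
            "lang_display": display,
        }
        for lang, display in names_by_lang.items()
    ]


records = flatten_by_language(
    1,
    {"EN": "india", "TAM": "\u0b87\u0ba8\u0bcd\u0ba4\u0bbf\u0baf\u0bbe"},
)
```

Each dict in records can then be sent to Solr as its own document, and a
query can filter on lang to get the right lang_display.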

Or, if you wanted to keep the record as your example, you might just
store {"EN":"India","TAM":"\u0b87\u0ba8\u0bcd\u0ba4\u0bbf\u0baf\u0bbe"}
in, say, a "languages" string field and have the front-end pull out
the right part based on the language desired using your favorite JSON
parser or whatever.
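
A minimal Python sketch of that front-end step, assuming the stored
"languages" value comes back as a JSON string. The function name
display_name and the fallback-to-EN behavior are assumptions, not part of
the suggestion above.

```python
import json


def display_name(languages_field, lang, fallback="EN"):
    """Pick the display value for lang out of the stored JSON string."""
    names = json.loads(languages_field)
    # Fall back to another language (assumed: EN) when lang is missing.
    return names.get(lang, names.get(fallback))


# Raw string, so the \u escapes reach the JSON parser unchanged.
stored = r'{"EN":"India","TAM":"\u0b87\u0ba8\u0bcd\u0ba4\u0bbf\u0baf\u0bbe"}'
english = display_name(stored, "EN")
tamil = display_name(stored, "TAM")
```

This keeps a single Solr document per key at the cost of doing the language
selection outside Solr.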

Best,
Erick

On Sat, May 19, 2018 at 9:15 AM, Doss  wrote:
> Hi,
>
> I found a workaround for our requirement with the help of Transforming
> Result Documents.
>
> https://lucene.apache.org/solr/guide/7_3/transforming-result-documents.html#subquery
>
> I need insight into the performance impact (if any) this will have. I
> assume the transformation happens after the parent query has obtained its
> results, so it should not add much overhead. However, we are going to use
> this functionality on a large and busy index, so before taking further
> steps I would like an expert opinion.
>
> Thanks,
> Doss.
>
>
>
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: SOLR: Array Key to Value on Result

2018-05-19 Thread Doss
Hi,

I found a workaround for our requirement with the help of Transforming
Result Documents.

https://lucene.apache.org/solr/guide/7_3/transforming-result-documents.html#subquery

I need insight into the performance impact (if any) this will have. I
assume the transformation happens after the parent query has obtained its
results, so it should not add much overhead. However, we are going to use
this functionality on a large and busy index, so before taking further
steps I would like an expert opinion.

Thanks,
Doss.



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html