Re: language specific fields of text

2013-01-07 Thread AlexeyK
You should use the language detection update processor factory, like below:

<processor class="org.apache.solr.update.processor.LangDetectLanguageIdentifierUpdateProcessorFactory">
  <str name="langid.fl">content</str>
  <str name="langid.langField">language</str>
  <str name="langid.fallback">en</str>
  <str name="langid.map">true</str>
  <str name="langid.map.fl">content,fullname</str>
  <str name="langid.map.keepOrig">true</str>
  <str name="langid.whitelist">en,fr,de,es,ru,it</str>
  <str name="langid.threshold">0.7</str>
</processor>

Once you have defined fields like content_en, content_fr, etc., they will be
filled in automatically according to the recognized language.

See http://wiki.apache.org/solr/LanguageDetection
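For that to work, the schema needs matching language-specific fields. A minimal sketch of what those definitions could look like (field and type names here are illustrative, not from the original message):

<field name="content" type="text_general" indexed="true" stored="true"/>
<field name="content_en" type="text_en" indexed="true" stored="true"/>
<field name="content_fr" type="text_fr" indexed="true" stored="true"/>
<!-- or cover every whitelisted language at once with a dynamic field -->
<dynamicField name="content_*" type="text_general" indexed="true" stored="true"/>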





Getting Lucene Query from Solr query (Or converting Solr Query to Lucene's query)

2013-01-07 Thread Sabeer Hussain
Is there a way to get Lucene's query from a Solr query? I have a requirement
to search for terms in multiple heterogeneous indices. Presently, I am using
the following approach:

try {
    Directory directory1 = FSDirectory.open(new File("E:\\database\\patient\\index"));
    Directory directory2 = FSDirectory.open(new File("E:\\database\\study\\index"));

    BooleanQuery myQuery = new BooleanQuery();
    myQuery.add(new TermQuery(new Term("PATIENT_GENDER", "Male")),
            BooleanClause.Occur.SHOULD);
    myQuery.add(new TermQuery(new Term("STUDY_DIVISION", "Cancer Center")),
            BooleanClause.Occur.SHOULD);

    int indexCount = 2;
    IndexReader[] indexReader = new IndexReader[indexCount];
    indexReader[0] = DirectoryReader.open(directory1);
    indexReader[1] = DirectoryReader.open(directory2);

    IndexSearcher searcher = new IndexSearcher(new MultiReader(indexReader));
    TopDocs col = searcher.search(myQuery, 10);

    // results
    ScoreDoc[] docs = col.scoreDocs;

} catch (IOException e) {
    // TODO Auto-generated catch block
    e.printStackTrace();
}

Here, I need to create TermQuery objects based on field names and their values. If I
can get this boolean query directly from the Solr query q=PATIENT_GENDER:Male OR
STUDY_DIVISION:"Cancer Center", that will save my coding effort. This one is
a simple example, but when we need to create a more complex query it will be a
time-consuming and error-prone activity. So, is there a way to get the
Lucene query from the Solr query?






Re: theory of sets (first solution)

2013-01-07 Thread Uwe Reh

Hi,

I found my own hack. It's based on a free interpretation of the function 
strdist().


Have:
- one multivalued field 'part_of'
- one unique field 'groupsort'

Index each item:
   For each group membership:
      add groupid to 'part_of'
      concat groupid and sortstring into a new string
      add this string to a CSV list
   End
   add the CSV list to 'groupsort'
End

You also need your own class that implements 
org.apache.lucene.search.spell.StringDistance, to generate a custom 
distance value. This class should:

- split the CSV list
- find the element/string that starts with the given group id
- translate the rest (sortstring) to a float value

.../select?q=part_of:X&sort=strdist(X, groupsort, FQN) asc
FQN is the fully qualified name of your own class. (Remember to place the 
jar in a 'lib' directory defined in solrconfig.xml, or add your own 'lib' entry.)
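A minimal sketch of such a class, assuming 'groupsort' holds entries of the form groupid:sortstring joined by commas (the entry layout and the float encoding are illustrative, not from the original message):

import org.apache.lucene.search.spell.StringDistance;

// Sketch: interprets strdist(groupId, groupsortField, FQN) as "the sort
// value of this document within group groupId". Assumes entries look like
// "<groupid>:<numeric sortstring>" joined by commas.
public class GroupSortDistance implements StringDistance {

    @Override
    public float getDistance(String groupId, String groupsortCsv) {
        for (String entry : groupsortCsv.split(",")) {
            if (entry.startsWith(groupId + ":")) {
                String sortstring = entry.substring(groupId.length() + 1);
                try {
                    return Float.parseFloat(sortstring);
                } catch (NumberFormatException e) {
                    return Float.MAX_VALUE; // unparsable value sorts last
                }
            }
        }
        return Float.MAX_VALUE; // no entry for this group: sort last
    }
}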


Uwe
(still looking for a smarter solution)



Re: Max number of core in Solr multi-core

2013-01-07 Thread Parvin Gasimzade
Thank you for your responses. I have one more question related to Solr
multi-core.
By using SolrJ I create new core for each application. When user wants to
add data or make query on his application, I create new HttpSolrServer for
this core. In this scenario there will be many running HttpSolrServer
instances.

Is there a better solution? Does it cause a problem to run many instances
at the same time?

On Wed, Jan 2, 2013 at 5:35 PM, Per Steffensen st...@designware.dk wrote:

 g a collection per application instead of a core


Re: Problem occurred in solr cloud setup org.apache.solr.client.solrj.SolrServerException: No live SolrServers available to handle this request

2013-01-07 Thread Erick Erickson
This is all quite strange, lots of people are using SolrCloud,
some with very large clusters, so I'm guessing it's something
in your setup that isn't obvious.

How certain are you that your network between the two
machines is reliable? And have you tried with a nightly build?

I'm grasping at straws because there's nothing obvious in what
you've told us so far, and I'm certain others aren't encountering
this problem...

Sorry I can't be more help,
Erick

On Sun, Jan 6, 2013 at 9:58 PM, yayati yayatirajpa...@gmail.com wrote:

 No live SolrServers


Re: Max number of core in Solr multi-core

2013-01-07 Thread Erick Erickson
This might help:
https://wiki.apache.org/solr/Solrj#HttpSolrServer

Note that the associated SolrRequest takes the path, I presume relative to
the base URL you initialized the HttpSolrServer with.

Best
Erick
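
Concretely, something along these lines may work with a single client instance (a sketch based on the hint above, assuming SolrJ 4.x; the core name is illustrative):

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.request.QueryRequest;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.params.ModifiableSolrParams;

// One client for the whole Solr instance...
SolrServer server = new HttpSolrServer("http://localhost:8998/solr");

ModifiableSolrParams params = new ModifiableSolrParams();
params.set("q", "*:*");

// ...and the core is selected per request via the request path.
QueryRequest request = new QueryRequest(params);
request.setPath("/core1/select");
QueryResponse response = request.process(server);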


On Mon, Jan 7, 2013 at 7:02 AM, Parvin Gasimzade parvin.gasimz...@gmail.com
 wrote:

 Thank you for your responses. I have one more question related to Solr
 multi-core.
 By using SolrJ I create new core for each application. When user wants to
 add data or make query on his application, I create new HttpSolrServer for
 this core. In this scenario there will be many running HttpSolrServer
 instances.

 Is there a better solution? Does it cause a problem to run many instances
 at the same time?

 On Wed, Jan 2, 2013 at 5:35 PM, Per Steffensen st...@designware.dk
 wrote:

  g a collection per application instead of a core



Re: custom solr sort

2013-01-07 Thread Uwe Reh

Am 06.01.2013 02:32, schrieb andy:

I want to customize Solr sorting and pass a Solr param from client to Solr server,


Hi Andy,

Not an answer to your question, but maybe another approach to solve your 
initial problem. Instead of writing a new SearchComponent, I decided to 
(mis)use the function http://wiki.apache.org/solr/FunctionQuery#strdist

'strdist' seems to have everything you need:
- a parameter 's1'
- a fieldname 's2'
- a slot to plug in your own algorithm

How to use this to sort on multivalued attributes is described in my 
thread theory of sets on this list.


Uwe


Re: custom solr sort

2013-01-07 Thread Upayavira
Can you explain why you want to implement a different sort first? There
may be other ways of achieving the same thing.

Upayavira

On Sun, Jan 6, 2013, at 01:32 AM, andy wrote:
 Hi,
 
 Maybe this is an old thread, or maybe it's different from the previous one.
 
 I want to customize Solr sorting and pass a Solr param from client to Solr
 server,
 so I implemented a SearchComponent named MySortComponent in my code,
 and also implemented FieldComparatorSource and FieldComparator. When I
 use
 the mysearch request handler (see the following code), I found that the custom sort
 only takes effect on the current page when I get multi-page results, but the
 sort is as expected when I set rows to contain all the results. Does
 anybody know how to solve this, or the reason?
 
 code snippet:
 
 public class MySortComponent extends SearchComponent implements
         SolrCoreAware {
 
     // moved out of prepare(): a local variable cannot be 'private', and
     // the inner comparator class needs access to it
     private RestTemplate restTemplate = new RestTemplate();
 
     public void inform(SolrCore arg0) {
     }
 
     @Override
     public void prepare(ResponseBuilder rb) throws IOException {
         SolrParams params = rb.req.getParams();
         String uid = params.get("uid");
 
         MyComparatorSource comparator = new MyComparatorSource(uid);
         SortSpec sortSpec = rb.getSortSpec();
         if (sortSpec.getSort() == null) {
             sortSpec.setSort(new Sort(new SortField[] {
                     new SortField("relation", comparator),
                     SortField.FIELD_SCORE }));
         } else {
             SortField[] current = sortSpec.getSort().getSort();
             ArrayList<SortField> sorts = new ArrayList<SortField>(
                     current.length + 1);
             sorts.add(new SortField("relation", comparator));
             for (SortField sf : current) {
                 sorts.add(sf);
             }
             sortSpec.setSort(new Sort(sorts.toArray(new SortField[sorts.size()])));
         }
     }
 
     @Override
     public void process(ResponseBuilder rb) throws IOException {
     }
 
     // -------------------------------------------------------------
     // SolrInfoMBean
     // -------------------------------------------------------------
 
     @Override
     public String getDescription() {
         return "Custom Sorting";
     }
 
     @Override
     public String getSource() {
         return "";
     }
 
     @Override
     public URL[] getDocs() {
         try {
             return new URL[] { new URL(
                     "http://wiki.apache.org/solr/QueryComponent") };
         } catch (MalformedURLException e) {
             throw new RuntimeException(e);
         }
     }
 
     public class MyComparatorSource extends FieldComparatorSource {
         private BitSet dg1;
         private BitSet dg2;
         private BitSet dg3;
 
         public MyComparatorSource(String uid) throws IOException {
             SearchResponse responseBody = restTemplate.postForObject(
                     "http://search.test.com/userid/search/" + uid, null,
                     SearchResponse.class);
 
             String d1 = responseBody.getOneDe();
             String d2 = responseBody.getTwoDe();
             String d3 = responseBody.getThreeDe();
 
             if (StringUtils.hasLength(d1)) {
                 byte[] bytes = Base64.decodeBase64(d1);
                 dg1 = BitSetHelper.loadFromBzip2ByteArray(bytes);
             }
 
             if (StringUtils.hasLength(d2)) {
                 byte[] bytes = Base64.decodeBase64(d2);
                 dg2 = BitSetHelper.loadFromBzip2ByteArray(bytes);
             }
 
             if (StringUtils.hasLength(d3)) {
                 byte[] bytes = Base64.decodeBase64(d3);
                 dg3 = BitSetHelper.loadFromBzip2ByteArray(bytes);
             }
         }
 
         @Override
         public FieldComparator newComparator(String fieldname,
                 final int numHits, int sortPos, boolean reversed)
                 throws IOException {
             return new RelationComparator(fieldname, numHits);
         }
 
         class RelationComparator extends FieldComparator {
             private int[] uidDoc;
             private float[] values;
             private float bottom;
             String fieldName;
 
             public RelationComparator(String fieldName, int numHits)
                     throws IOException {
                 values = new float[numHits];
                 this.fieldName = fieldName;
             }
 
             @Override
             public int compare(int slot1, int slot2) {
                 if (values[slot1] < values[slot2])
                     return -1;
                 if (values[slot1] > values[slot2])
                     return 1;
                 return 0;
             }
 
             @Override
             public int compareBottom(int doc) throws IOException {
                 float docDistance = getRelation(doc);
Re: Max number of core in Solr multi-core

2013-01-07 Thread Parvin Gasimzade
I know that, but my question is different. Let me ask it this way.

I have a Solr with base url localhost:8998/solr and two Solr cores at
localhost:8998/solr/core1 and localhost:8998/solr/core2.

I have one base SolrServer instance initialized as:
SolrServer server = new HttpSolrServer( url );

I have also created a SolrServer for each core as:
SolrServer core1 = new HttpSolrServer( url + "/core1" );
SolrServer core2 = new HttpSolrServer( url + "/core2" );

Since there are many cores, I have to initialize a SolrServer as shown above.
Is there a way to create only one SolrServer with the base url and access
each core using it? If it is possible, then I don't need to create a new
SolrServer for each core.

On Mon, Jan 7, 2013 at 2:39 PM, Erick Erickson erickerick...@gmail.comwrote:

 This might help:
 https://wiki.apache.org/solr/Solrj#HttpSolrServer

 Note that the associated SolrRequest takes the path, I presume relative to
 the base URL you initialized the HttpSolrServer with.

 Best
 Erick


 On Mon, Jan 7, 2013 at 7:02 AM, Parvin Gasimzade 
 parvin.gasimz...@gmail.com
  wrote:

  Thank you for your responses. I have one more question related to Solr
  multi-core.
  By using SolrJ I create new core for each application. When user wants to
  add data or make query on his application, I create new HttpSolrServer
 for
  this core. In this scenario there will be many running HttpSolrServer
  instances.
 
  Is there a better solution? Does it cause a problem to run many instances
  at the same time?
 
  On Wed, Jan 2, 2013 at 5:35 PM, Per Steffensen st...@designware.dk
  wrote:
 
   g a collection per application instead of a core
 



Re: Getting Lucene Query from Solr query (Or converting Solr Query to Lucene's query)

2013-01-07 Thread Roman Chyla
if you are inside solr, as it seems to be the case, you can do this:

QParserPlugin qplug =
req.getCore().getQueryPlugin(LuceneQParserPlugin.NAME);
QParser parser = qplug.createParser("PATIENT_GENDER:Male OR
STUDY_DIVISION:\"Cancer Center\"", null, req.getParams(), req);
Query q = parser.parse();

maybe there is a one-line call to get the parser from solr core, but i
can't find it now. Have a look at one of the subclasses of QParser

--roman
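
(For what it's worth, the static helper QParser.getParser may be the one-line call mentioned above -- a sketch, worth verifying against your Solr version:)

// Looks up the query plugin and builds the parser in one call.
QParser parser = QParser.getParser(
        "PATIENT_GENDER:Male OR STUDY_DIVISION:\"Cancer Center\"", null, req);
Query q = parser.parse();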

On Mon, Jan 7, 2013 at 4:27 AM, Sabeer Hussain shuss...@del.aithent.comwrote:

 Is there a way to get Lucene's query from a Solr query? I have a requirement
 to search for terms in multiple heterogeneous indices. Presently, I am
 using
 the following approach:

 try {
     Directory directory1 = FSDirectory.open(new
 File("E:\\database\\patient\\index"));
     Directory directory2 = FSDirectory.open(new
 File("E:\\database\\study\\index"));

     BooleanQuery myQuery = new BooleanQuery();
     myQuery.add(new TermQuery(new
 Term("PATIENT_GENDER", "Male")),
 BooleanClause.Occur.SHOULD);
     myQuery.add(new TermQuery(new
 Term("STUDY_DIVISION", "Cancer Center")),
 BooleanClause.Occur.SHOULD);

     int indexCount = 2;
     IndexReader[] indexReader = new
 IndexReader[indexCount];
     indexReader[0] = DirectoryReader.open(directory1);
     indexReader[1] = DirectoryReader.open(directory2);

     IndexSearcher searcher = new IndexSearcher(new
 MultiReader(indexReader));
     TopDocs col = searcher.search(myQuery, 10);

     // results
     ScoreDoc[] docs = col.scoreDocs;

 } catch (IOException e) {
     // TODO Auto-generated catch block
     e.printStackTrace();
 }

 Here, I need to create TermQuery objects based on field names and their values. If I
 can get this boolean query directly from the Solr query q=PATIENT_GENDER:Male
 OR
 STUDY_DIVISION:"Cancer Center", that will save my coding effort. This one
 is
 a simple example, but when we need to create a more complex query it will be a
 time-consuming and error-prone activity. So, is there a way to get the
 Lucene query from the Solr query?







RE: Max number of core in Solr multi-core

2013-01-07 Thread Jay Parashar
This is the exact approach we use in our multithreaded env. One server per
core. I think this is the recommended approach.

-Original Message-
From: Parvin Gasimzade [mailto:parvin.gasimz...@gmail.com] 
Sent: Monday, January 07, 2013 7:00 AM
To: solr-user@lucene.apache.org
Subject: Re: Max number of core in Solr multi-core

I know that, but my question is different. Let me ask it this way.

I have a Solr with base url localhost:8998/solr and two Solr cores at
localhost:8998/solr/core1 and localhost:8998/solr/core2.

I have one base SolrServer instance initialized as:
SolrServer server = new HttpSolrServer( url );

I have also created a SolrServer for each core as:
SolrServer core1 = new HttpSolrServer( url + "/core1" );
SolrServer core2 = new HttpSolrServer( url + "/core2" );

Since there are many cores, I have to initialize a SolrServer as shown above.
Is there a way to create only one SolrServer with the base url and access
each core using it? If it is possible, then I don't need to create a new
SolrServer for each core.

On Mon, Jan 7, 2013 at 2:39 PM, Erick Erickson
erickerick...@gmail.comwrote:

 This might help:
 https://wiki.apache.org/solr/Solrj#HttpSolrServer

 Note that the associated SolrRequest takes the path, I presume 
 relative to the base URL you initialized the HttpSolrServer with.

 Best
 Erick


 On Mon, Jan 7, 2013 at 7:02 AM, Parvin Gasimzade  
 parvin.gasimz...@gmail.com
  wrote:

  Thank you for your responses. I have one more question related to 
  Solr multi-core.
  By using SolrJ I create new core for each application. When user 
  wants to add data or make query on his application, I create new 
  HttpSolrServer
 for
  this core. In this scenario there will be many running 
  HttpSolrServer instances.
 
  Is there a better solution? Does it cause a problem to run many 
  instances at the same time?
 
  On Wed, Jan 2, 2013 at 5:35 PM, Per Steffensen st...@designware.dk
  wrote:
 
   g a collection per application instead of a core
 




RE: RE: Max number of core in Solr multi-core

2013-01-07 Thread Darren Govoni

This should be clarified some. In the client API, a SolrServer represents a 
connection to a single server backend/endpoint and should be re-used where possible.

The approach being discussed is to have one client connection (represented by the SolrServer class) per Solr core, all residing in a single Solr server (as is the case below, but not required). 


--- Original Message ---
On 1/7/2013 08:06 AM Jay Parashar wrote:
 This is the exact approach we use in our multithreaded env. One server per
 core. I think this is the recommended approach.

 -Original Message-
 From: Parvin Gasimzade [mailto:parvin.gasimz...@gmail.com]
 Sent: Monday, January 07, 2013 7:00 AM
 To: solr-user@lucene.apache.org
 Subject: Re: Max number of core in Solr multi-core

 I know that, but my question is different. Let me ask it this way.

 I have a Solr with base url localhost:8998/solr and two Solr cores at
 localhost:8998/solr/core1 and localhost:8998/solr/core2.

 I have one base SolrServer instance initialized as:
 SolrServer server = new HttpSolrServer( url );

 I have also created a SolrServer for each core as:
 SolrServer core1 = new HttpSolrServer( url + "/core1" );
 SolrServer core2 = new HttpSolrServer( url + "/core2" );

 Since there are many cores, I have to initialize a SolrServer as shown above.
 Is there a way to create only one SolrServer with the base url and access
 each core using it? If it is possible, then I don't need to create a new
 SolrServer for each core.

 On Mon, Jan 7, 2013 at 2:39 PM, Erick Erickson
 erickerick...@gmail.com wrote:

  This might help:
  https://wiki.apache.org/solr/Solrj#HttpSolrServer
 
  Note that the associated SolrRequest takes the path, I presume
  relative to the base URL you initialized the HttpSolrServer with.
 
  Best
  Erick
 
 
  On Mon, Jan 7, 2013 at 7:02 AM, Parvin Gasimzade
  parvin.gasimz...@gmail.com
   wrote:
 
   Thank you for your responses. I have one more question related to
   Solr multi-core.
   By using SolrJ I create new core for each application. When user
   wants to add data or make query on his application, I create new
   HttpSolrServer
  for
   this core. In this scenario there will be many running
   HttpSolrServer instances.
  
   Is there a better solution? Does it cause a problem to run many
   instances at the same time?
  
   On Wed, Jan 2, 2013 at 5:35 PM, Per Steffensen st...@designware.dk
   wrote:
  
    g a collection per application instead of a core


Re: Sorting on multivalued fields still impossible?

2013-01-07 Thread Uwe Reh

Hi Jack,

thank you for the hint.
Since I already have a SolrJ client to do the preprocessing, mapping to 
sort fields isn't my problem. I will try to explain better in my reply 
to Erick.


Uwe
(Sorry for the late reaction)


Am 30.08.2012 16:04, schrieb Jack Krupansky:

You can also use a Field Mutating Update Processor to do a smart
copy of a multi-valued field to a sortable single-valued field.

See:
http://wiki.apache.org/solr/UpdateRequestProcessor#Field_Mutating_Update_Processors


Such as using the maximum value via MaxFieldValueUpdateProcessorFactory.

See:
http://lucene.apache.org/solr/api-4_0_0-BETA/org/apache/solr/update/processor/MaxFieldValueUpdateProcessorFactory.html


Which value of a multi-valued field do you wish to sort by?

-- Jack Krupansky




Re: Sorting on multivalued fields still impossible?

2013-01-07 Thread Uwe Reh

Am 31.08.2012 13:35, schrieb Erick Erickson:

... what would the correct behavior
be for sorting on a multivalued field


Hi Erick,

In general you are right: the question with multivalued fields is which 
value is the reference. But there are thousands of cases where this 
question is implicitly answered. See my example ...sort=max(datefield) 
desc. It is obvious that the newest date should win. I see no 
reason why simple functions like max can't handle multivalued fields.


Now, four months later, I still wonder why there is no pluggable 
function to map multivalued fields into a single value.

e.g. ...sort=sqrt(mapMultipleToOne(FQN, fieldname)) asc...

Uwe
(Sorry for the late reaction)




Re: Sorting on multivalued fields still impossible?

2013-01-07 Thread Alexandre Rafalovitch
If the multiple-to-one mapping is stable (e.g. independent of the
query), why not implement it as a custom update.chain processor with a copy
to a separate field? There are already a couple of implementations
under FieldValueMutatingUpdateProcessor (first, last, max, min).
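
For example, a chain along these lines might do it (a sketch for solrconfig.xml; the field names are illustrative and the factory configuration should be checked against your Solr version):

<updateRequestProcessorChain name="max-date">
  <!-- copy the multivalued field into a dedicated sort field -->
  <processor class="solr.CloneFieldUpdateProcessorFactory">
    <str name="source">date</str>
    <str name="dest">date_max</str>
  </processor>
  <!-- keep only the maximum of the cloned values -->
  <processor class="solr.MaxFieldValueUpdateProcessorFactory">
    <str name="fieldName">date_max</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>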

Regards,
   Alex.

Personal blog: http://blog.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all at
once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD book)


On Mon, Jan 7, 2013 at 8:19 AM, Uwe Reh r...@hebis.uni-frankfurt.de wrote:

 Am 31.08.2012 13:35, schrieb Erick Erickson:

 ... what would the correct behavior

 be for sorting on a multivalued field


 Hi Erick,

 In general you are right: the question with multivalued fields is which
 value is the reference. But there are thousands of cases where this
 question is implicitly answered. See my example ...sort=max(datefield)
 desc. It is obvious that the newest date should win. I see no reason
 why simple functions like max can't handle multivalued fields.

 Now, four months later, I still wonder why there is no pluggable
 function to map multivalued fields into a single value.
 e.g. ...sort=sqrt(mapMultipleToOne(FQN, fieldname)) asc...

 Uwe
 (Sorry for the late reaction)





Re: Sorting on multivalued fields still impossible?

2013-01-07 Thread Uwe Reh

Hi,

As I just wrote in my reply to the similar suggestion from Jack,
I'm not looking for a way to preprocess my data.

My question is: why do I need two redundant fields to sort a multivalued 
field ('date_max' and 'date_min' for 'date')?

For me it's just a waste of space, poisoning the fieldcache.

There is also another class of problems where a filter function like 
'mapMultipleToOne' may be helpful. In the thread 'theory of sets' (this 
list) I described a hack with the function strdist, my own class and the 
mapping of multiple values as a CSV list in a single-value field.


Uwe




Am 07.01.2013 14:54, schrieb Alexandre Rafalovitch:

If the multiple-to-one mapping is stable (e.g. independent of the
query), why not implement it as a custom update.chain processor with a copy
to a separate field? There are already a couple of implementations
under FieldValueMutatingUpdateProcessor (first, last, max, min).

Regards,
Alex.





SOLR Cloud : what is the best backup/restore strategy ?

2013-01-07 Thread LEFEBVRE Guillaume
Hello,

Using a SOLR Cloud architecture, what is the best procedure to backup and 
restore SOLR index and configuration ?

Thanks,
Guillaume



RE: theory of sets

2013-01-07 Thread Petersen, Robert
Hi Uwe,

We have hundreds of dynamic fields, but since most of our docs only use some of 
them it doesn't seem to be a performance drag.  They can be viewed as a sparse 
matrix of fields in your indexed docs.  Then if you make the 
sortinfo_for_groupx field an int, it could be used in a function query to 
perform your sorting.  See http://wiki.apache.org/solr/FunctionQuery


Robi

-Original Message-
From: Uwe Reh [mailto:r...@hebis.uni-frankfurt.de] 
Sent: Thursday, January 03, 2013 1:10 PM
To: solr-user@lucene.apache.org
Subject: theory of sets

Hi,

I'm looking for a tricky solution to a common problem. I have to handle a lot 
of items, and each could be a member of several groups.
- OK, just add a field called 'member_of'

No, that's not enough, because each group is sorted and each member has a 
sortstring for this group.
- OK, still easy: add a dynamic field 'sortinfo_for_*' and fill it for each 
group membership.

Yes, this works, but there are thousands of different groups; that many dynamic 
fields are probably a serious performance issue.
- Well ...

I'm looking for a smart way to answer the question "Find the members of 
group X and sort them by the sortstring for this group."

One idea I had was to fill the 'member_of' field with composed entries 
(groupname + "_" + sortstring). Finding the members is easy with wildcards, but 
there seems to be no way to use the sortstring as a boost factor.

Has anybody solved this problem?
Any hints are welcome.

Uwe



Re: theory of sets

2013-01-07 Thread Uwe Reh

Hi Robi,

thank you for the contribution. It's exciting to read that your index 
isn't contaminated by the number of fields. I can't exclude other 
mistakes, but my first experience with extensive use of dynamic fields 
has been very poor response times.


Even though I found another solution, I should give the straightforward 
solution a second chance.


Uwe

Am 07.01.2013 17:40, schrieb Petersen, Robert:

Hi Uwe,

We have hundreds of dynamic fields, but since most of our docs only use some of 
them it doesn't seem to be a performance drag.  They can be viewed as a sparse 
matrix of fields in your indexed docs.  Then if you make the 
sortinfo_for_groupx field an int, it could be used in a function query to 
perform your sorting.  See http://wiki.apache.org/solr/FunctionQuery




"No live SolrServers" Solr 4 exceptions on trying to create a collection

2013-01-07 Thread Jay Parashar
Any clue as to why this is happening will be greatly appreciated. This has become 
a blocker for me.
I can use the HttpSolrServer to create a core/make requests etc., but then it 
behaves like Solr 3.6:
http://host:port/solr/admin/cores and not 
http://host:port/solr/admin/collections

With my setup (4 servers running at localhost 8983, 8900, 7574 and 7500) when I 
manually do a 
http://127.0.0.1:7500/solr/admin/cores?action=CREATE&name=myColl1&instanceDir=default&dataDir=myColl1Data&collection=myColl1&numShards=2
it creates the collection only at the 7500 server. This is similar to when I 
use HttpSolrServer (Solr 3.6 behavior).

And of course when I initiate a 
http://127.0.0.1:7500/solr/admin/collections?action=CREATE&name=myColl2&instanceDir=default&dataDir=myColl2Data&collection=myColl2&numShards=2
as expected it creates the collection spread on 2 servers. I am failing to 
achieve the same with SolrJ. As in the code at the bottom of the mail, I use 
CloudSolrServer and get the "No live SolrServers" exception.

Any help or direction on how to create collections (using the collections 
API) with SolrJ will be highly appreciated.

Regards
Jay


-Original Message-
From: Jay Parashar [mailto:jparas...@itscape.com] 
Sent: Sunday, January 06, 2013 7:42 PM
To: solr-user@lucene.apache.org
Subject: RE: Solr 4 exceptions on trying to create a collection

The exception "No live SolrServers" is being thrown when trying to create a new 
Collection (code at end of this mail). On the CloudSolrServer request method, 
we have this line: ClientUtils.appendMap(coll, slices, 
clusterState.getSlices(coll)); where "coll" is the new collection I am trying 
to create, and hence clusterState.getSlices(coll) is returning null.
And then the loop over the slices which adds to the urlList never happens, and 
hence the LBHttpSolrServer created in the CloudSolrServer has a null url list 
in the constructor.
This is giving the "No live SolrServers" exception.

What am I missing?

Instead of passing the CloudSolrServer to the create.process, if I pass the 
LBHttpSolrServer  (server.getLbServer()), the collection gets created but only 
on one server.

My code to create a new Cloud Server and new Collection:-

String[] urls =
{"http://127.0.0.1:8983/solr/", "http://127.0.0.1:8900/solr/", "http://127.0.0.1:7500/solr/", "http://127.0.0.1:7574/solr/"};
CloudSolrServer server = new CloudSolrServer("127.0.0.1:2181", new
LBHttpSolrServer(urls));
server.getLbServer().getHttpClient().getParams().setParameter(CoreConnectionPNames.CONNECTION_TIMEOUT, 5000);
server.getLbServer().getHttpClient().getParams().setParameter(CoreConnectionPNames.SO_TIMEOUT, 2);
server.setDefaultCollection(collectionName);
server.connect();
CoreAdminRequest.Create create = new CoreAdminRequest.Create();
create.setCoreName("myColl");
create.setCollection("myColl");
create.setInstanceDir("defaultDir");
create.setDataDir("myCollData");
create.setNumShards(2);
create.process(server); // Exception "No live SolrServers" is thrown here


Thanks
Jay


-Original Message-
From: Alexandre Rafalovitch [mailto:arafa...@gmail.com]
Sent: Friday, January 04, 2013 6:08 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr 4 exceptions on trying to create a collection

Tried Wireshark yet to see what host/port it is trying to connect and why it 
fails? It is a complex tool, but well worth learning.

Regards,
  Alex.

Personal blog: http://blog.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all at once. 
Lately, it doesn't seem to be working.  (Anonymous  - via GTD book)


On Fri, Jan 4, 2013 at 6:58 PM, Jay Parashar jparas...@itscape.com wrote:

 Thanks! I had a different version of httpclient in the classpath. So 
 the 2nd exception is gone but now I am  back to the first one 
 org.apache.solr.client.solrj.SolrServerException: No live SolrServers 
 available to handle this request

 -Original Message-
 From: Alexandre Rafalovitch [mailto:arafa...@gmail.com]
 Sent: Friday, January 04, 2013 4:21 PM
 To: solr-user@lucene.apache.org
 Subject: Re: Solr 4 exceptions on trying to create a collection

 For the second one:

 Wrong version of library on a classpath or multiple versions of 
 library on the classpath which causes wrong classes with missing 
 fields/variables? Or library interface baked in and the implementation 
 is newer. Some sort of mismatch basically. Most probably in Apache http 
 library.

 Regards,
Alex.

 Personal blog: http://blog.outerthoughts.com/
 LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
 - Time is the quality of nature that keeps events from happening all 
 at once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD
 book)


 On Fri, Jan 4, 2013 at 4:34 PM, Jay Parashar jparas...@itscape.com
 wrote:

 
  Hi All,
 
  I am getting exceptions on trying to create a collection. Any help 
  is appreciated.
 
  While trying to create a collection, I got this 

Re: "No live SolrServers" Solr 4 exceptions on trying to create a collection

2013-01-07 Thread Rafał Kuć
Hello!

Can you share the command you use to start all four Solr servers ?

-- 
Regards,
 Rafał Kuć
 Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch

 Any clue as to why this is happening will be greatly appreciated. This has 
 become a blocker for me.
 I can use the HttpSolrServer to create a core/make requests etc., but then it 
 behaves like Solr 3.6:
 http://host:port/solr/admin/cores and not
 http://host:port/solr/admin/collections

 With my setup (4 servers running at localhost 8983, 8900, 7574 and 7500) when 
 I manually do a
 http://127.0.0.1:7500/solr/admin/cores?action=CREATE&name=myColl1&instanceDir=default&dataDir=myColl1Data&collection=myColl1&numShards=2
 it creates the collection only at the 7500 server. This is similar
 to when I use HttpSolrServer (Solr 3.6 behavior).

 And of course when I initiate a 
 http://127.0.0.1:7500/solr/admin/collections?action=CREATE&name=myColl2&instanceDir=default&dataDir=myColl2Data&collection=myColl2&numShards=2
 as expected it creates the collection spread on 2 servers. I am
 failing to achieve the same with SolrJ. As in the code at the bottom
 of the mail, I use CloudSolrServer and get the "No live SolrServers" exception.

 Any help or direction on how to create collections (using the
 collections API) with SolrJ will be highly appreciated.

 Regards
 Jay


 -Original Message-
 From: Jay Parashar [mailto:jparas...@itscape.com] 
 Sent: Sunday, January 06, 2013 7:42 PM
 To: solr-user@lucene.apache.org
 Subject: RE: Solr 4 exceptions on trying to create a collection

 The exception "No live SolrServers" is being thrown when trying to
 create a new Collection (code at end of this mail). On the
 CloudSolrServer request method, we have this line:
 ClientUtils.appendMap(coll, slices, clusterState.getSlices(coll));
 where "coll" is the new collection I am trying to create and hence
 clusterState.getSlices(coll) is returning null.
 And then the loop over the slices which adds to the urlList never
 happens, and hence the LBHttpSolrServer created in the
 CloudSolrServer has a null url list in the constructor.
 This is giving the "No live SolrServers" exception.

 What am I missing?

 Instead of passing the CloudSolrServer to the create.process, if I
 pass the LBHttpSolrServer  (server.getLbServer()), the collection
 gets created but only on one server.

 My code to create a new Cloud Server and new Collection:-

 String[] urls =
 {"http://127.0.0.1:8983/solr/", "http://127.0.0.1:8900/solr/", "http://127.0.0.1:7500/solr/", "http://127.0.0.1:7574/solr/"};
 CloudSolrServer server = new CloudSolrServer("127.0.0.1:2181", new
 LBHttpSolrServer(urls));
 server.getLbServer().getHttpClient().getParams().setParameter(CoreConnectionPNames.CONNECTION_TIMEOUT, 5000);
 server.getLbServer().getHttpClient().getParams().setParameter(CoreConnectionPNames.SO_TIMEOUT, 2);
 server.setDefaultCollection(collectionName);
 server.connect();
 CoreAdminRequest.Create create = new CoreAdminRequest.Create();
 create.setCoreName("myColl");
 create.setCollection("myColl");
 create.setInstanceDir("defaultDir");
 create.setDataDir("myCollData");
 create.setNumShards(2);
 create.process(server); // Exception "No live SolrServers" is thrown here


 Thanks
 Jay


 -Original Message-
 From: Alexandre Rafalovitch [mailto:arafa...@gmail.com]
 Sent: Friday, January 04, 2013 6:08 PM
 To: solr-user@lucene.apache.org
 Subject: Re: Solr 4 exceptions on trying to create a collection

 Tried Wireshark yet to see what host/port it is trying to connect
 and why it fails? It is a complex tool, but well worth learning.

 Regards,
   Alex.

 Personal blog: http://blog.outerthoughts.com/
 LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
 - Time is the quality of nature that keeps events from happening
 all at once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD 
 book)


 On Fri, Jan 4, 2013 at 6:58 PM, Jay Parashar jparas...@itscape.com wrote:

 Thanks! I had a different version of httpclient in the classpath. So 
 the 2nd exception is gone but now I am  back to the first one 
 org.apache.solr.client.solrj.SolrServerException: No live SolrServers 
 available to handle this request

 -Original Message-
 From: Alexandre Rafalovitch [mailto:arafa...@gmail.com]
 Sent: Friday, January 04, 2013 4:21 PM
 To: solr-user@lucene.apache.org
 Subject: Re: Solr 4 exceptions on trying to create a collection

 For the second one:

 Wrong version of library on a classpath or multiple versions of 
 library on the classpath which causes wrong classes with missing 
 fields/variables? Or library interface baked in and the implementation 
 is newer. Some sort of mismatch basically. Most probably in Apache http 
 library.

 Regards,
Alex.

 Personal blog: http://blog.outerthoughts.com/
 LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
 - Time is the quality of nature that keeps events from happening all 
 at once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD
 book)


 On 

Re: Will SolrCloud always slice by ID hash?

2013-01-07 Thread Scott Stults
Thanks guys. Yeah, separate rolling collections seem like the better way to
go.


-Scott

On Sat, Dec 29, 2012 at 1:30 AM, Otis Gospodnetic 
otis.gospodne...@gmail.com wrote:

 https://issues.apache.org/jira/browse/SOLR-4237


Re: "No live SolrServers" Solr 4 exceptions on trying to create a collection

2013-01-07 Thread Mark Miller

On Jan 7, 2013, at 12:33 PM, Jay Parashar jparas...@itscape.com wrote:

 With my setup (4 servers running at localhost 8983, 8900, 7574 and 7500) when 
 I manually do a 
 http://127.0.0.1:7500/solr/admin/cores?action=CREATE&name=myColl1&instanceDir=default&dataDir=myColl1Data&collection=myColl1&numShards=2
 it creates the collection only at the 7500 server. This is similar to when I 
 use HttpSolrServer (Solr 3.6 behavior).

This only starts one core. If you want to use the CoreAdmin API you would need 
to make four calls, one to each server.

If you want this done for you, you must use the Collections API - see the wiki: 
http://wiki.apache.org/solr/SolrCloud#Managing_collections_via_the_Collections_API

- Mark

RE: "No live SolrServers" Solr 4 exceptions on trying to create a collection

2013-01-07 Thread Jay Parashar
Hi Rafał,

The following scripts are started in the same order (external zk, 1 instance 
running at localhost:2181). I also tried with the embedded zk, with the same 
result.

#Start of Server 1
export SOLR_HOME=/home/apache-solr-4.0.0
cd shard1A
java \
 -Djetty.port=8983 \
 -Djetty.home=$SOLR_HOME/example/ \
 -Dsolr.solr.home=multicore \
 -Dbootstrap_confdir=./multicore/defaultCore/conf \
 -Dcollection.configName=defaultConfig \
 -DzkHost=localhost:2181 \
 -DnumShards=2 \
 -jar $SOLR_HOME/example/start.jar

#Start of Server 2
export SOLR_HOME=/home/apache-solr-4.0.0
cd shard2A
java \
 -Djetty.port=8900 \
 -Djetty.home=$SOLR_HOME/example/ \
 -Dsolr.solr.home=multicore \
 -DzkHost=localhost:2181 \
 -jar $SOLR_HOME/example/start.jar

#Start of Server 3
export SOLR_HOME=/home/apache-solr-4.0.0
cd shard1B
java \
 -Djetty.port=7574 \
 -Djetty.home=$SOLR_HOME/example/ \
 -Dsolr.solr.home=multicore \
 -DzkHost=localhost:2181 \
 -jar $SOLR_HOME/example/start.jar

#Start of Server 4
export SOLR_HOME=/home/apache-solr-4.0.0
cd shard2B
java \
 -Djetty.port=7500 \
 -Djetty.home=$SOLR_HOME/example/ \
 -Dsolr.solr.home=multicore \
 -DzkHost=localhost:2181 \
 -jar $SOLR_HOME/example/start.jar

Regards
Jay

-Original Message-
From: Rafał Kuć [mailto:r@solr.pl] 
Sent: Monday, January 07, 2013 11:44 AM
To: solr-user@lucene.apache.org
Subject: Re: No live SolrServers Solr 4 exceptions on trying to create a 
collection

Hello!

Can you share the command you use to start all four Solr servers ?

-- 
Regards,
 Rafał Kuć
 Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch

 Any clue as to why this is happening will be greatly appreciated. This has 
 become a blocker for me.
 I can use the HttpSolrServer to create a core/make requests etc., but then it 
 behaves like Solr 3.6:
 http://host:port/solr/admin/cores and not
 http://host:port/solr/admin/collections

 With my setup (4 servers running at localhost 8983, 8900, 7574 and 7500) when 
 I manually do a
 http://127.0.0.1:7500/solr/admin/cores?action=CREATE&name=myColl1&instanceDir=default&dataDir=myColl1Data&collection=myColl1&numShards=2
 it creates the collection only at the 7500 server. This is similar
 to when I use HttpSolrServer (Solr 3.6 behavior).

 And of course when I initiate a 
 http://127.0.0.1:7500/solr/admin/collections?action=CREATE&name=myColl2&instanceDir=default&dataDir=myColl2Data&collection=myColl2&numShards=2
 as expected it creates the collection spread on 2 servers. I am
 failing to achieve the same with SolrJ. As in the code at the bottom
 of the mail, I use CloudSolrServer and get the "No live SolrServers" exception.

 Any help or direction on how to create collections (using the
 collections API) with SolrJ will be highly appreciated.

 Regards
 Jay


 -Original Message-
 From: Jay Parashar [mailto:jparas...@itscape.com] 
 Sent: Sunday, January 06, 2013 7:42 PM
 To: solr-user@lucene.apache.org
 Subject: RE: Solr 4 exceptions on trying to create a collection

 The exception "No live SolrServers" is being thrown when trying to
 create a new Collection (code at end of this mail). On the
 CloudSolrServer request method, we have this line:
 ClientUtils.appendMap(coll, slices, clusterState.getSlices(coll));
 where "coll" is the new collection I am trying to create and hence
 clusterState.getSlices(coll) is returning null.
 And then the loop over the slices which adds to the urlList never
 happens, and hence the LBHttpSolrServer created in the
 CloudSolrServer has a null url list in the constructor.
 This is giving the "No live SolrServers" exception.

 What am I missing?

 Instead of passing the CloudSolrServer to the create.process, if I
 pass the LBHttpSolrServer  (server.getLbServer()), the collection
 gets created but only on one server.

 My code to create a new Cloud Server and new Collection:-

 String[] urls =
 {"http://127.0.0.1:8983/solr/", "http://127.0.0.1:8900/solr/", "http://127.0.0.1:7500/solr/", "http://127.0.0.1:7574/solr/"};
 CloudSolrServer server = new CloudSolrServer("127.0.0.1:2181", new
 LBHttpSolrServer(urls));
 server.getLbServer().getHttpClient().getParams().setParameter(CoreConnectionPNames.CONNECTION_TIMEOUT, 5000);
 server.getLbServer().getHttpClient().getParams().setParameter(CoreConnectionPNames.SO_TIMEOUT, 2);
 server.setDefaultCollection(collectionName);
 server.connect();
 CoreAdminRequest.Create create = new CoreAdminRequest.Create();
 create.setCoreName("myColl");
 create.setCollection("myColl");
 create.setInstanceDir("defaultDir");
 create.setDataDir("myCollData");
 create.setNumShards(2);
 create.process(server); // Exception "No live SolrServers" is thrown here


 Thanks
 Jay


 -Original Message-
 From: Alexandre Rafalovitch [mailto:arafa...@gmail.com]
 Sent: Friday, January 04, 2013 6:08 PM
 To: solr-user@lucene.apache.org
 Subject: Re: Solr 4 exceptions on trying to create 

RE: "No live SolrServers" Solr 4 exceptions on trying to create a collection

2013-01-07 Thread Jay Parashar
Right Mark,

I am accessing the Collections API using Solrj. This is where I am stuck. If
I just use the Collections API using http thru the browser, the behavior is
as expected. Is there an example of using the Collections API using SolrJ?
My code looks like

String[] urls =
{"http://127.0.0.1:8983/solr/", "http://127.0.0.1:8900/solr/", "http://127.0.0.1:7500/solr/", "http://127.0.0.1:7574/solr/"};
CloudSolrServer server = new CloudSolrServer("127.0.0.1:2181", new
LBHttpSolrServer(urls));
server.getLbServer().getHttpClient().getParams().setParameter(CoreConnectionPNames.CONNECTION_TIMEOUT, 5000);
server.getLbServer().getHttpClient().getParams().setParameter(CoreConnectionPNames.SO_TIMEOUT, 2);
server.setDefaultCollection(collectionName);
server.connect();
CoreAdminRequest.Create create = new CoreAdminRequest.Create();
create.setCoreName("myColl");
create.setCollection("myColl");
create.setInstanceDir("defaultDir");
create.setDataDir("myCollData");
create.setNumShards(2);
create.process(server); // Exception "No live SolrServers" is thrown here

Regards
Jay

-Original Message-
From: Mark Miller [mailto:markrmil...@gmail.com] 
Sent: Monday, January 07, 2013 11:57 AM
To: solr-user@lucene.apache.org
Subject: Re: No live SolrServers Solr 4 exceptions on trying to create a
collection


On Jan 7, 2013, at 12:33 PM, Jay Parashar jparas...@itscape.com wrote:

 With my setup (4 servers running at localhost 8983, 8900, 7574 and 7500)
when I manually do a 

http://127.0.0.1:7500/solr/admin/cores?action=CREATE&name=myColl1&instanceDir=default&dataDir=myColl1Data&collection=myColl1&numShards=2
 it creates the collection only at the 7500 server. This is similar to when
I use HttpSolrServer (Solr 3.6 behavior).

This only starts one core. If you want to use the CoreAdmin API you would
need to make four calls, one to each server.

If you want this done for you, you must use the Collections API - see the
wiki:
http://wiki.apache.org/solr/SolrCloud#Managing_collections_via_the_Collectio
ns_API

- Mark



Re: "No live SolrServers" Solr 4 exceptions on trying to create a collection

2013-01-07 Thread Alexandre Rafalovitch
Can you run the SolrJ client from another machine (so you go over the
network) and put Wireshark in between? It will tell you if something is
actually trying to connect, or if the problem is even earlier.

Otherwise, if you are on U*ix style machines look into dtrace/truss to see
the activity. On Windows machines look at ProcessMonitor from Sysinternals.

These are all 'hammer' size tools, but if you are truly stuck, they could
be a way forward.

Good luck,
   Alex.

Personal blog: http://blog.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all at
once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD book)


On Mon, Jan 7, 2013 at 1:20 PM, Jay Parashar jparas...@itscape.com wrote:

 Right Mark,

 I am accessing the Collections API using Solrj. This is where I am stuck.
 If
 I just use the Collections API using http thru the browser, the behavior is
 as expected. Is there an example of using the Collections API using SolrJ?
 My code looks like

 String[] urls =
 {"http://127.0.0.1:8983/solr/", "http://127.0.0.1:8900/solr/", "http://127.0.0.1:7500/solr/", "http://127.0.0.1:7574/solr/"};
 CloudSolrServer server = new CloudSolrServer("127.0.0.1:2181", new
 LBHttpSolrServer(urls));
 server.getLbServer().getHttpClient().getParams().setParameter(CoreConnectionPNames.CONNECTION_TIMEOUT, 5000);
 server.getLbServer().getHttpClient().getParams().setParameter(CoreConnectionPNames.SO_TIMEOUT, 2);
 server.setDefaultCollection(collectionName);
 server.connect();
 CoreAdminRequest.Create create = new CoreAdminRequest.Create();
 create.setCoreName("myColl");
 create.setCollection("myColl");
 create.setInstanceDir("defaultDir");
 create.setDataDir("myCollData");
 create.setNumShards(2);
 create.process(server); // Exception "No live SolrServers" is thrown here

 Regards
 Jay

 -Original Message-
 From: Mark Miller [mailto:markrmil...@gmail.com]
 Sent: Monday, January 07, 2013 11:57 AM
 To: solr-user@lucene.apache.org
 Subject: Re: No live SolrServers Solr 4 exceptions on trying to create a
 collection


 On Jan 7, 2013, at 12:33 PM, Jay Parashar jparas...@itscape.com wrote:

  With my setup (4 servers running at localhost 8983, 8900, 7574 and 7500)
 when I manually do a
 

  http://127.0.0.1:7500/solr/admin/cores?action=CREATE&name=myColl1&instanceDir=default&dataDir=myColl1Data&collection=myColl1&numShards=2
  it creates the collection only at the 7500 server. This is similar to
 when
 I use HttpSolrServer (Solr 3.6 behavior).

 This only starts one core. If you want to use the CoreAdmin API you would
 need to make four calls, one to each server.

 If you want this done for you, you must use the Collections API - see the
 wiki:

 http://wiki.apache.org/solr/SolrCloud#Managing_collections_via_the_Collectio
 ns_API

 - Mark




Re: How to size a SOLR Cloud

2013-01-07 Thread Otis Gospodnetic
Hello FF,

Something like SPM for Solr will help you understand what's making Solr
slow - CPU maxed? Disk IO? Swapping? Caches too small? ...

There are no general rules/recipes, but once you see what is going on we
can provide guidance.

Yes, you can have 1 or more replicas of a shard.

Otis
--
Solr & ElasticSearch Support
http://sematext.com/





On Mon, Jan 7, 2013 at 12:14 PM, f.fourna...@gibmedia.fr 
f.fourna...@gibmedia.fr wrote:

 Hello,
 I'm new to SOLR and I have a collection with 25 million records.
 I want to run this collection on SOLR Cloud (Solr 4.0) under Amazon EC2
 instances.
 Currently I've configured 2 shards and 2 replicas per shard with Medium
 instances (4 GB, 1 CPU core), and response times are very long.
 How should I size the cloud (sharding, replicas, memory, CPU, ...) to have
 acceptable response times in my situation? More memory? More CPU? More
 shards? Do rules for sizing a Solr cloud exist?
 Is it possible to have more than 2 replicas of one shard? Is it relevant?
 Best regards
 FF



Re: SOLR Cloud : what is the best backup/restore strategy ?

2013-01-07 Thread Otis Gospodnetic
Hi,

There may be a better way, but stopping indexing and then
using http://master_host:port/solr/replication?command=backup on each node
may do the backup trick.  I'd love to see how/if others do it.

Otis
--
Solr & ElasticSearch Support
http://sematext.com/





On Mon, Jan 7, 2013 at 10:33 AM, LEFEBVRE Guillaume 
guillaume.lefeb...@cegedim.fr wrote:

 Hello,

 Using a SOLR Cloud architecture, what is the best procedure to backup and
 restore SOLR index and configuration ?

 Thanks,
 Guillaume




Re: Sorting on multivalued fields still impossible?

2013-01-07 Thread Chris Hostetter

: My question is: why do I need two redundant fields to sort a multivalued field
: ('date_max' and 'date_min' for 'date')?
: For me it's just a waste of space, poisoning the fieldcache.

how do two fields poison the fieldcache? ... if there were a function 
that could find the min or max value of a multi-valued field, it would 
need to construct an UnInvertedField of all N of the field values of each 
doc in order to find the min/max at query time -- by pre-computing a 
min_field and max_field at indexing time you only need FieldCaches for 
those 2 fields (where 2 <= N, and N may be very big)

Generally speaking: most Solr use cases are willing to pay a slightly 
higher indexing cost (time/cpu) to have faster searches -- which answers 
your earlier question...

 Now, four months later, I still wonder why there is no pluggable 
 function to map multivalued fields into a single value.

...because no one has written/contributed these functions (because most 
people would rather pay that cost at indexing time)



-Hoss


Re: SOLR Cloud : what is the best backup/restore strategy ?

2013-01-07 Thread Mark Miller
You should be able to continue indexing fine - it will just keep a point in 
time snapshot around until the copy is done. So you can trigger a backup at 
anytime to create a backup for that specific time, and keep indexing away, and 
the next night do the same thing. You will always have backed up to the point 
in time the backup command is received.

- Mark

On Jan 7, 2013, at 1:45 PM, Otis Gospodnetic otis.gospodne...@gmail.com wrote:

 Hi,
 
 There may be a better way, but stopping indexing and then
 using http://master_host:port/solr/replication?command=backup on each node
 may do the backup trick.  I'd love to see how/if others do it.
 
 Otis
 --
 Solr & ElasticSearch Support
 http://sematext.com/
 
 
 
 
 
 On Mon, Jan 7, 2013 at 10:33 AM, LEFEBVRE Guillaume 
 guillaume.lefeb...@cegedim.fr wrote:
 
 Hello,
 
 Using a SOLR Cloud architecture, what is the best procedure to backup and
 restore SOLR index and configuration ?
 
 Thanks,
 Guillaume
 
 



Re: "No live SolrServers" Solr 4 exceptions on trying to create a collection

2013-01-07 Thread Mark Miller
 http://127.0.0.1:7500/solr/admin/cores?

Why did you paste that as the example then :) ?

4.0 has problems using the collections api with the CloudSolrServer. You will 
be able to do it for 4.1, but for 4.0 you have to use an HttpSolrServer and 
pick a node to talk to. For 4.0, CloudSolrServer is just good for querying and 
updating.

- Mark
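
A minimal sketch of that workaround for 4.0 (send the Collections API call through a plain HttpSolrServer as a raw request; the host, collection name and shard count here are illustrative):

import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.request.QueryRequest;
import org.apache.solr.common.params.ModifiableSolrParams;

// Pick any live node and address the Collections API on it directly.
HttpSolrServer node = new HttpSolrServer("http://127.0.0.1:8983/solr");

ModifiableSolrParams params = new ModifiableSolrParams();
params.set("action", "CREATE");
params.set("name", "myColl2");
params.set("numShards", 2);

// Route the request to /admin/collections instead of the default /select.
QueryRequest request = new QueryRequest(params);
request.setPath("/admin/collections");
node.request(request);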

On Jan 7, 2013, at 1:20 PM, Jay Parashar jparas...@itscape.com wrote:

 Right Mark,
 
 I am accessing the Collections API using Solrj. This is where I am stuck. If
 I just use the Collections API using http thru the browser, the behavior is
 as expected. Is there an example of using the Collections API using SolrJ?
 My code looks like
 
 String[] urls =
 {"http://127.0.0.1:8983/solr/", "http://127.0.0.1:8900/solr/", "http://127.0.0.1:7500/solr/", "http://127.0.0.1:7574/solr/"};
 CloudSolrServer server = new CloudSolrServer("127.0.0.1:2181", new
 LBHttpSolrServer(urls));
 server.getLbServer().getHttpClient().getParams().setParameter(CoreConnectionPNames.CONNECTION_TIMEOUT, 5000);
 server.getLbServer().getHttpClient().getParams().setParameter(CoreConnectionPNames.SO_TIMEOUT, 2);
 server.setDefaultCollection(collectionName);
 server.connect();
 CoreAdminRequest.Create create = new CoreAdminRequest.Create();
 create.setCoreName("myColl");
 create.setCollection("myColl");
 create.setInstanceDir("defaultDir");
 create.setDataDir("myCollData");
 create.setNumShards(2);
 create.process(server); // Exception "No live SolrServers" is thrown here
 
 Regards
 Jay
 
 -Original Message-
 From: Mark Miller [mailto:markrmil...@gmail.com] 
 Sent: Monday, January 07, 2013 11:57 AM
 To: solr-user@lucene.apache.org
 Subject: Re: No live SolrServers Solr 4 exceptions on trying to create a
 collection
 
 
 On Jan 7, 2013, at 12:33 PM, Jay Parashar jparas...@itscape.com wrote:
 
 With my setup (4 servers running at localhost 8983, 8900, 7574 and 7500)
 when I manually do a 
 
 http://127.0.0.1:7500/solr/admin/cores?action=CREATE&name=myColl1&instanceDir=default&dataDir=myColl1Data&collection=myColl1&numShards=2
 it creates the collection only at the 7500 server. This is similar to when
 I use HttpSolrServer (Solr 3.6 behavior).
 
 This only starts one core. If you want to use the CoreAdmin API you would
 need to make four calls, one to each server.
 
 If you want this done for you, you must use the Collections API - see the
 wiki:
 http://wiki.apache.org/solr/SolrCloud#Managing_collections_via_the_Collectio
 ns_API
 
 - Mark
 



Re: SOLR Cloud : what is the best backup/restore strategy ?

2013-01-07 Thread Michel Dion
Is it possible to restore an index (previously backed up) using the same
kind of HTTP REST-like request? Something like
 ...solr/replication?command=restore ?

On Mon, Jan 7, 2013 at 2:12 PM, Mark Miller markrmil...@gmail.com wrote:

 You should be able to continue indexing fine - it will just keep a point
 in time snapshot around until the copy is done. So you can trigger a backup
 at anytime to create a backup for that specific time, and keep indexing
 away, and the next night do the same thing. You will always have backed up
 to the point in time the backup command is received.

 - Mark

 On Jan 7, 2013, at 1:45 PM, Otis Gospodnetic otis.gospodne...@gmail.com
 wrote:

  Hi,
 
  There may be a better way, but stopping indexing and then
  using http://master_host:port/solr/replication?command=backup on each
 node
  may do the backup trick.  I'd love to see how/if others do it.
 
  Otis
  --
  Solr & ElasticSearch Support
  http://sematext.com/
 
 
 
 
 
  On Mon, Jan 7, 2013 at 10:33 AM, LEFEBVRE Guillaume 
  guillaume.lefeb...@cegedim.fr wrote:
 
  Hello,
 
  Using a SOLR Cloud architecture, what is the best procedure to backup
 and
  restore SOLR index and configuration ?
 
  Thanks,
  Guillaume
 
 




RE: "No live SolrServers" Solr 4 exceptions on trying to create a collection

2013-01-07 Thread Jay Parashar
Thanks Mark! I will wait for 4.1 then.

Actually I pasted both /admin/cores and /admin/collections to highlight that
the problem was only with SolrJ; both admin/cores and admin/collections were
working as expected.

Sorry for the confusion.

Regards
Jay

-Original Message-
From: Mark Miller [mailto:markrmil...@gmail.com] 
Sent: Monday, January 07, 2013 1:14 PM
To: solr-user@lucene.apache.org
Subject: Re: No live SolrServers Solr 4 exceptions on trying to create a
collection

 http://127.0.0.1:7500/solr/admin/cores?

Why did you paste that as the example then :) ?

4.0 has problems using the collections api with the CloudSolrServer. You
will be able to do it for 4.1, but for 4.0 you have to use an HttpSolrServer
and pick a node to talk to. For 4.0, CloudSolrServer is just good for
querying and updating.

- Mark

On Jan 7, 2013, at 1:20 PM, Jay Parashar jparas...@itscape.com wrote:

 Right Mark,
 
 I am accessing the Collections API using Solrj. This is where I am stuck.
If
 I just use the Collections API using http thru the browser, the behavior
is
 as expected. Is there an example of using the Collections API using SolrJ?
 My code looks like
 
 String[] urls =
 {"http://127.0.0.1:8983/solr/", "http://127.0.0.1:8900/solr/", "http://127.0.0.1:7500/solr/", "http://127.0.0.1:7574/solr/"};
 CloudSolrServer server = new CloudSolrServer("127.0.0.1:2181", new
 LBHttpSolrServer(urls));
 server.getLbServer().getHttpClient().getParams().setParameter(CoreConnectionPNames.CONNECTION_TIMEOUT, 5000);
 server.getLbServer().getHttpClient().getParams().setParameter(CoreConnectionPNames.SO_TIMEOUT, 2);
 server.setDefaultCollection(collectionName);
 server.connect();
 CoreAdminRequest.Create create = new CoreAdminRequest.Create();
 create.setCoreName("myColl");
 create.setCollection("myColl");
 create.setInstanceDir("defaultDir");
 create.setDataDir("myCollData");
 create.setNumShards(2);
 create.process(server); // Exception "No live SolrServers" is thrown here
 
 Regards
 Jay
 
 -Original Message-
 From: Mark Miller [mailto:markrmil...@gmail.com] 
 Sent: Monday, January 07, 2013 11:57 AM
 To: solr-user@lucene.apache.org
 Subject: Re: No live SolrServers Solr 4 exceptions on trying to create a
 collection
 
 
 On Jan 7, 2013, at 12:33 PM, Jay Parashar jparas...@itscape.com wrote:
 
 With my setup (4 servers running at localhost 8983, 8900, 7574 and 7500)
 when I manually do a 
 

 http://127.0.0.1:7500/solr/admin/cores?action=CREATE&name=myColl1&instanceDir=default&dataDir=myColl1Data&collection=myColl1&numShards=2
 it creates the collection only at the 7500 server. This is similar to
when
 I use HttpSolrServer (Solr 3.6 behavior).
 
 This only starts one core. If you want to use the CoreAdmin API you would
 need to make four calls, one to each server.
 
 If you want this done for you, you must use the Collections API - see the
 wiki:

 http://wiki.apache.org/solr/SolrCloud#Managing_collections_via_the_Collections_API
 
 - Mark
 



Re: SOLR Cloud : what is the best backup/restore strategy ?

2013-01-07 Thread Mark Miller
Not to my knowledge. You could do a delete all and then merge the index in with 
the core admin API, but that would be a less efficient copy basically, rather 
than a straight file move. There is not currently a restore command though. 
Also, keep in mind that unless you back up to a network store or I suppose 
another disk drive or something, your backup is pretty precarious.

- Mark
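
For reference, a rough SolrJ sketch of that delete-then-merge workaround
(host, core name and snapshot directory are assumptions, untested):

import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.request.QueryRequest;
import org.apache.solr.common.params.ModifiableSolrParams;

public class RestoreByMerge {
  public static void main(String[] args) throws Exception {
    // 1. wipe the live core
    HttpSolrServer core = new HttpSolrServer("http://127.0.0.1:8983/solr/collection1");
    core.deleteByQuery("*:*");
    core.commit();

    // 2. merge the backed-up index in via the core admin API
    HttpSolrServer admin = new HttpSolrServer("http://127.0.0.1:8983/solr");
    ModifiableSolrParams params = new ModifiableSolrParams();
    params.set("action", "mergeindexes");
    params.set("core", "collection1");                    // target core
    params.set("indexDir", "/backups/snapshot.20130107"); // assumed snapshot dir
    QueryRequest request = new QueryRequest(params);
    request.setPath("/admin/cores");
    admin.request(request);

    // 3. commit so the merged segments become visible
    core.commit();
    core.shutdown();
    admin.shutdown();
  }
}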

On Jan 7, 2013, at 2:21 PM, Michel Dion diom...@gmail.com wrote:

 Is it possible to restore an index (previously backed up) using the same
 kind of HTTP REST-like request? Something like
 ...solr/replication?command=restore ?
 
 On Mon, Jan 7, 2013 at 2:12 PM, Mark Miller markrmil...@gmail.com wrote:
 
 You should be able to continue indexing fine - it will just keep a point
 in time snapshot around until the copy is done. So you can trigger a backup
 at anytime to create a backup for that specific time, and keep indexing
 away, and the next night do the same thing. You will always have backed up
 to the point in time the backup command is received.
 
 - Mark
 
 On Jan 7, 2013, at 1:45 PM, Otis Gospodnetic otis.gospodne...@gmail.com
 wrote:
 
 Hi,
 
 There may be a better way, but stopping indexing and then
 using http://master_host:port/solr/replication?command=backup on each
 node
 may do the backup trick.  I'd love to see how/if others do it.
 
 Otis
 --
 Solr  ElasticSearch Support
 http://sematext.com/
 
 
 
 
 
 On Mon, Jan 7, 2013 at 10:33 AM, LEFEBVRE Guillaume 
 guillaume.lefeb...@cegedim.fr wrote:
 
 Hello,
 
 Using a SOLR Cloud architecture, what is the best procedure to backup
 and
 restore SOLR index and configuration ?
 
 Thanks,
 Guillaume
 
 
 
 



Solr cloud not starting properly. Only starts leaders.

2013-01-07 Thread davers
Every time I stop my SolrCloud (3 shards, 1 replica each, total 6 servers)
and then restart it I get the following error:

SEVERE: Error getting leader from zk
org.apache.solr.common.SolrException: Could not get leader props
at 
org.apache.solr.cloud.ZkController.getLeaderProps(ZkController.java:709)
at 
org.apache.solr.cloud.ZkController.getLeaderProps(ZkController.java:673)
at org.apache.solr.cloud.ZkController.getLeader(ZkController.java:638)
at org.apache.solr.cloud.ZkController.register(ZkController.java:577)
at org.apache.solr.cloud.ZkController.register(ZkController.java:532)
at 
org.apache.solr.core.CoreContainer.registerInZk(CoreContainer.java:709)
at org.apache.solr.core.CoreContainer.register(CoreContainer.java:693)
at org.apache.solr.core.CoreContainer.load(CoreContainer.java:535)
at org.apache.solr.core.CoreContainer.load(CoreContainer.java:356)
at
org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:308)
at
org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:107)
at
org.apache.catalina.core.ApplicationFilterConfig.initFilter(ApplicationFilterConfig.java:278)
at
org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:259)
at
org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:383)
at
org.apache.catalina.core.ApplicationFilterConfig.init(ApplicationFilterConfig.java:104)
at
org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:4650)
at
org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:5306)
at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:150)
at
org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:901)
at 
org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:877)
at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:633)
at org.apache.catalina.startup.HostConfig.deployWAR(HostConfig.java:977)
at
org.apache.catalina.startup.HostConfig$DeployWar.run(HostConfig.java:1655)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:722)
Caused by: org.apache.zookeeper.KeeperException$NoNodeException:
KeeperErrorCode = NoNode for /collections/productindex/leaders/shard1
at org.apache.zookeeper.KeeperException.create(KeeperException.java:102)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:927)
at
org.apache.solr.common.cloud.SolrZkClient$7.execute(SolrZkClient.java:244)
at
org.apache.solr.common.cloud.SolrZkClient$7.execute(SolrZkClient.java:241)
at
org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.java:63)
at 
org.apache.solr.common.cloud.SolrZkClient.getData(SolrZkClient.java:241)
at 
org.apache.solr.cloud.ZkController.getLeaderProps(ZkController.java:687)
... 28 more

Jan 07, 2013 1:23:50 PM org.apache.solr.common.SolrException log
SEVERE: :org.apache.solr.common.SolrException: Error getting leader from zk
at org.apache.solr.cloud.ZkController.getLeader(ZkController.java:662)
at org.apache.solr.cloud.ZkController.register(ZkController.java:577)
at org.apache.solr.cloud.ZkController.register(ZkController.java:532)
at 
org.apache.solr.core.CoreContainer.registerInZk(CoreContainer.java:709)
at org.apache.solr.core.CoreContainer.register(CoreContainer.java:693)
at org.apache.solr.core.CoreContainer.load(CoreContainer.java:535)
at org.apache.solr.core.CoreContainer.load(CoreContainer.java:356)
at
org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:308)
at
org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:107)
at
org.apache.catalina.core.ApplicationFilterConfig.initFilter(ApplicationFilterConfig.java:278)
at
org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:259)
at
org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:383)
at
org.apache.catalina.core.ApplicationFilterConfig.init(ApplicationFilterConfig.java:104)
at
org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:4650)
at
org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:5306)
at 

Re: SOLR Cloud : what is the best backup/restore strategy ?

2013-01-07 Thread Marcin Rzewucki
There's no problem with indexing while taking a snapshot. The only issue I
found is a problem with the index directory:
https://issues.apache.org/jira/browse/SOLR-4170
It looks like Solr always looks in the .../data/index/ directory without
reading the index.properties file (sometimes your index dir name can be like
index.date). So far it's easy to work around, and hopefully it will be
fixed in 4.1. As far as I checked, the snapshotting process is harmless to
Solr performance, so it is reliable.
In case of index recovery you can use the files from your last snapshot and
just send the updates newer than that. At least, that's what I do and it
works pretty well.

Regards.

On 7 January 2013 20:27, Mark Miller markrmil...@gmail.com wrote:

 Not to my knowledge. You could do a delete all and then merge the index in
 with the core admin API, but that would be a less efficient copy basically,
 rather than a straight file move. There is not currently a restore command
 though. Also, keep in mind that unless you back up to a network store or I
 suppose another disk drive or something, your backup is pretty precarious.

 - Mark

 On Jan 7, 2013, at 2:21 PM, Michel Dion diom...@gmail.com wrote:

  Is it possible to restore an index (previously backed up) using the same
  kind of HTTP REST-like request? Something like
  ...solr/replication?command=restore ?
 
  On Mon, Jan 7, 2013 at 2:12 PM, Mark Miller markrmil...@gmail.com
 wrote:
 
  You should be able to continue indexing fine - it will just keep a point
  in time snapshot around until the copy is done. So you can trigger a
 backup
  at anytime to create a backup for that specific time, and keep indexing
  away, and the next night do the same thing. You will always have backed
 up
  to the point in time the backup command is received.
 
  - Mark
 
  On Jan 7, 2013, at 1:45 PM, Otis Gospodnetic 
 otis.gospodne...@gmail.com
  wrote:
 
  Hi,
 
  There may be a better way, but stopping indexing and then
  using http://master_host:port/solr/replication?command=backup on each
  node
  may do the backup trick.  I'd love to see how/if others do it.
 
  Otis
  --
  Solr  ElasticSearch Support
  http://sematext.com/
 
 
 
 
 
  On Mon, Jan 7, 2013 at 10:33 AM, LEFEBVRE Guillaume 
  guillaume.lefeb...@cegedim.fr wrote:
 
  Hello,
 
  Using a SOLR Cloud architecture, what is the best procedure to backup
  and
  restore SOLR index and configuration ?
 
  Thanks,
  Guillaume
 
 
 
 




Re: theory of sets

2013-01-07 Thread Upayavira
Dynamic fields resulted in poor response times? How many fields did each
document have? I can't see how a dynamic field would be any
different from any other field in terms of response time.

Or are you querying across a large number of dynamic fields
concurrently? I can imagine that slowing things down.

Upayavira

On Mon, Jan 7, 2013, at 05:18 PM, Uwe Reh wrote:
 Hi Robi,
 
  thank you for the contribution. It's exciting to read that your index 
  isn't contaminated by the number of fields. I can't exclude other 
  mistakes, but my first experience with extensive use of dynamic fields 
  was very poor response times.
  
  Even though I found another solution, I should give the straightforward 
  solution a second chance.
 
 Uwe
 
 Am 07.01.2013 17:40, schrieb Petersen, Robert:
  Hi Uwe,
 
  We have hundreds of dynamic fields but since most of our docs only use some 
  of them it doesn't seem to be a performance drag.  They can be viewed as a 
  sparse matrix of fields in your indexed docs.  Then if you make the 
  sortinfo_for_groupx an int, it could be used in a function query to 
  perform your sorting.  See http://wiki.apache.org/solr/FunctionQuery
 


RE: theory of sets

2013-01-07 Thread Zhang, Lisheng
Hi,

Just a thought on this possibility: I think the dynamic field is a Solr concept;
on the Lucene level all fields are the same, but at initial startup Lucene
should load all field information into memory (not the field data, but the schema).

If we have too many fields (like *_my_fields, * = a1, a2, ...), does this take 
too much memory and slow down performance (even if very few fields are really 
used)?

Best regards, Lisheng

-Original Message-
From: Upayavira [mailto:u...@odoko.co.uk]
Sent: Monday, January 07, 2013 2:57 PM
To: solr-user@lucene.apache.org
Subject: Re: theory of sets


Dynamic fields resulted in poor response times? How many fields did each
document have? I can't see how a dynamic field would be any
different from any other field in terms of response time.

Or are you querying across a large number of dynamic fields
concurrently? I can imagine that slowing things down.

Upayavira

On Mon, Jan 7, 2013, at 05:18 PM, Uwe Reh wrote:
 Hi Robi,
 
  thank you for the contribution. It's exciting to read that your index 
  isn't contaminated by the number of fields. I can't exclude other 
  mistakes, but my first experience with extensive use of dynamic fields 
  was very poor response times.
  
  Even though I found another solution, I should give the straightforward 
  solution a second chance.
 
 Uwe
 
 Am 07.01.2013 17:40, schrieb Petersen, Robert:
  Hi Uwe,
 
  We have hundreds of dynamic fields but since most of our docs only use some 
  of them it doesn't seem to be a performance drag.  They can be viewed as a 
  sparse matrix of fields in your indexed docs.  Then if you make the 
  sortinfo_for_groupx an int, it could be used in a function query to 
  perform your sorting.  See http://wiki.apache.org/solr/FunctionQuery
 


Re: When does Solr actually convert textual representation into non-text formats (e.g. Date)

2013-01-07 Thread Chris Hostetter
: Subject: When does Solr actually convert textual representation into non-text
: formats (e.g. Date)

The short answer is: any place you want.

At the lowest level, FieldType's are required to support converting 
(legal) String values into whatever native java object best represents 
their type -- but they are also allowed/encouraged to accept objects of 
that native type directly and use them as is.

In some cases, like with the XmlUpdateRequestHandler code, the raw 
string input is left as is and passed down to the FieldType, because the 
RequestHandler's parsing code shouldn't make assumptions about the field 
types -- in other cases, like the JavaBinUpdateRequestHandler, the type 
info comes along with the data, so it can easily pass the 
Integer/Date/whatever on to the FieldType.

In between, things like UpdateRequestProcessors can convert from String to 
Date or vice versa as they see fit.
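
A minimal sketch of such a processor -- purely illustrative, with a made-up
field name and date pattern, against the 4.x processor API:

import java.io.IOException;
import java.text.ParseException;
import java.text.SimpleDateFormat;

import org.apache.solr.common.SolrInputDocument;
import org.apache.solr.request.SolrQueryRequest;
import org.apache.solr.response.SolrQueryResponse;
import org.apache.solr.update.AddUpdateCommand;
import org.apache.solr.update.processor.UpdateRequestProcessor;
import org.apache.solr.update.processor.UpdateRequestProcessorFactory;

public class ParseDateProcessorFactory extends UpdateRequestProcessorFactory {

  @Override
  public UpdateRequestProcessor getInstance(SolrQueryRequest req,
      SolrQueryResponse rsp, UpdateRequestProcessor next) {
    return new ParseDateProcessor(next);
  }

  static class ParseDateProcessor extends UpdateRequestProcessor {
    static final String FIELD = "created_dt"; // made-up field name

    ParseDateProcessor(UpdateRequestProcessor next) {
      super(next);
    }

    @Override
    public void processAdd(AddUpdateCommand cmd) throws IOException {
      SolrInputDocument doc = cmd.getSolrInputDocument();
      Object raw = doc.getFieldValue(FIELD);
      if (raw instanceof String) {
        try {
          // convert the custom format to a java.util.Date before the
          // FieldType ever sees the value
          doc.setField(FIELD,
              new SimpleDateFormat("yyyy/MM/dd").parse((String) raw));
        } catch (ParseException e) {
          throw new IOException("bad date in " + FIELD, e);
        }
      }
      super.processAdd(cmd); // hand the doc down the rest of the chain
    }
  }
}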

As for DIH: i'm not entirely sure of all the places where a String might 
be converted to a Date ... i think there are special transformers for 
that, but when dealing with things like jdbc datasources you 
frequently get a true Date object back from the jdbc connection, and i 
*think* DIH uses those Date objects as is.

: 4) copyField

copy field is not something i've ever considered in this context ... i 
genuinely don't know what would happen if you copyField'd from a 
TrieDateField to TextField and your indexing code was providing a true 
Date object ... i suspect you'd get a simple date.toString() in the text 
field.


-Hoss


Re: Solr cloud not starting properly. Only starts leaders.

2013-01-07 Thread Mark Miller

On Jan 7, 2013, at 4:26 PM, davers dboych...@improvementdirect.com wrote:

 KeeperErrorCode = NoNode for /collections/productindex/leaders/shard1

Odd - offhand I don't recall something like this being brought up before. Is 
this new for you, or has it always existed? Solr 4.0?

As for a key to the colors, there is an open JIRA issue for 4.1, and I 
think even a patch.

- Mark

Re: custom solr sort

2013-01-07 Thread andy
Hi Upayavira,

The custom sort field is not stored in the index. I want to achieve a
requirement that different search users will get different search results
when they search the same keyword with my search engine; each search user has
a relationship with each result document in Solr, but the relationship
is provided by another team's REST service.
So the search sequence is as follows:
1. I add the search user's id to the Solr query (i.e.
   query.setParam("uid", vo.getUserId());)
   and specify my own request handler *mysearch*: query.setParam("qt",
   "mysearch");

2. MySortComponent sets the custom sort as the first sort.
3. MyComparatorSource gets the uid and sends a request to a REST service to
   get the relationship for that uid.
4. It sorts the results.

Do you have any suggestions?



Upayavira wrote
 Can you explain why you want to implement a different sort first? There
 may be other ways of achieving the same thing.
 
 Upayavira
 
 On Sun, Jan 6, 2013, at 01:32 AM, andy wrote:
 Hi,
 
 Maybe this is an old thread, or maybe it's different from a previous one.
 
 I want to customize Solr sorting and pass a Solr param from the client to
 the Solr server, so I implemented a SearchComponent named MySortComponent
 in my code, and also implemented FieldComparatorSource and FieldComparator.
 When I use the mysearch request handler (see the following code), I found
 that the custom sort only affects the current page when I get multi-page
 results, but the sort is as expected when I set rows to contain all the
 results. Does anybody know the reason, or how to solve it?
 
 code snippet:
 
 public class MySortComponent extends SearchComponent implements
 SolrCoreAware {
 
     // hoisted to a field: a 'private' declaration inside prepare() would not compile
     private RestTemplate restTemplate = new RestTemplate();
 
     public void inform(SolrCore core) {
     }
 
     @Override
     public void prepare(ResponseBuilder rb) throws IOException {
         SolrParams params = rb.req.getParams();
         String uid = params.get("uid");
 
         MyComparatorSource comparator = new MyComparatorSource(uid);
         SortSpec sortSpec = rb.getSortSpec();
         if (sortSpec.getSort() == null) {
             // no sort given: sort by the custom relation first, then score
             sortSpec.setSort(new Sort(new SortField[] {
                     new SortField("relation", comparator), SortField.FIELD_SCORE }));
         } else {
             // prepend the custom relation sort to whatever sort was requested
             SortField[] current = sortSpec.getSort().getSort();
             ArrayList<SortField> sorts = new ArrayList<SortField>(current.length + 1);
             sorts.add(new SortField("relation", comparator));
             for (SortField sf : current) {
                 sorts.add(sf);
             }
             sortSpec.setSort(new Sort(sorts.toArray(new SortField[sorts.size()])));
         }
     }
 
     @Override
     public void process(ResponseBuilder rb) throws IOException {
     }
 
     // -----------------------------------------------------------
     // SolrInfoMBean
     // -----------------------------------------------------------
 
     @Override
     public String getDescription() {
         return "Custom Sorting";
     }
 
     @Override
     public String getSource() {
         return "";
     }
 
     @Override
     public URL[] getDocs() {
         try {
             return new URL[] { new URL(
                     "http://wiki.apache.org/solr/QueryComponent") };
         } catch (MalformedURLException e) {
             throw new RuntimeException(e);
         }
     }
 
     public class MyComparatorSource extends FieldComparatorSource {
         private BitSet dg1;
         private BitSet dg2;
         private BitSet dg3;
 
         public MyComparatorSource(String uid) throws IOException {
             // fetch this user's relationship bitmaps from the external REST service
             SearchResponse responseBody = restTemplate.postForObject(
                     "http://search.test.com/userid/search/" + uid, null,
                     SearchResponse.class);
 
             String d1 = responseBody.getOneDe();
             String d2 = responseBody.getTwoDe();
             String d3 = responseBody.getThreeDe();
 
             if (StringUtils.hasLength(d1)) {
                 byte[] bytes = Base64.decodeBase64(d1);
                 dg1 = BitSetHelper.loadFromBzip2ByteArray(bytes);
             }
 
             if (StringUtils.hasLength(d2)) {
                 byte[] bytes = Base64.decodeBase64(d2);
                 dg2 = BitSetHelper.loadFromBzip2ByteArray(bytes);
             }
 
             if (StringUtils.hasLength(d3)) {
                 byte[] bytes = Base64.decodeBase64(d3);
                 dg3 = BitSetHelper.loadFromBzip2ByteArray(bytes);
             }
         }
 
         @Override
         public FieldComparator newComparator(String fieldname,
                 final int numHits, int sortPos, boolean reversed)
                 throws IOException {
             return new RelationComparator(fieldname, numHits);
         }
 
         class 

Re: custom solr sort

2013-01-07 Thread Chris Hostetter

: mysearch request handler (see the following code), I found that the custom
: sort only affects the current page when I get multi-page results, but the
: sort is as expected when I set rows to contain all the results. Does
: anybody know the reason, or how to solve it?

I haven't familiarized myself with the Lucene sort code in a while, and 
much of your custom sort code is Greek to me, but this one method does 
jump out at me...


: @Override
: public int compareDocToValue(int arg0, Object arg1)
: throws IOException {
: // TODO Auto-generated method stub
: return 0;
: }

...i'm pretty sure you need to implement that method correctly to get 
meaningful sort ordering.


FWIW: If i was in your place, and had an external REST service that 
provided me with the sort values to use for each doc's unique key, given a 
user's unique id, my first inclination would not be to implement it as a 
custom SearchComponent.

My first inclination would be to implement it as a custom 
ValueSourceParser (returning a custom ValueSource), and then leverage the 
function query syntax in the sort (ie: sort=myFunction(the_user_id) asc) 
... that should mean a lot less non-sort related code you have to write.  
(or if i was still using Solr 3.6.x, i might implement a special FieldType 
-- using RandomSortField as inspiration -- and then register it with a UID__* 
dynamicField so that sort=UID__the_user_id asc called my REST service using 
'the_user_id' as input)
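
For the archives, a rough sketch of that ValueSourceParser idea against the
4.0 API (the class name, function name, and the weightsFor lookup are all
made up for illustration; signatures changed in later releases):

import java.io.IOException;
import java.util.Map;

import org.apache.lucene.index.AtomicReaderContext;
import org.apache.lucene.queries.function.FunctionValues;
import org.apache.lucene.queries.function.ValueSource;
import org.apache.lucene.queries.function.docvalues.FloatDocValues;
import org.apache.lucene.queryparser.classic.ParseException;
import org.apache.solr.search.FunctionQParser;
import org.apache.solr.search.ValueSourceParser;

// would be registered in solrconfig.xml, e.g.:
//   <valueSourceParser name="myFunction" class="MyFunctionParser"/>
public class MyFunctionParser extends ValueSourceParser {

  // stub standing in for the external REST lookup -- not a real API
  static float[] weightsFor(String uid, AtomicReaderContext ctx) {
    return new float[ctx.reader().maxDoc()]; // all zeros; fill from the service
  }

  @Override
  public ValueSource parse(FunctionQParser fp) throws ParseException {
    final String uid = fp.parseArg(); // from sort=myFunction(the_user_id) asc
    return new ValueSource() {
      @Override
      public FunctionValues getValues(Map context, AtomicReaderContext readerContext)
          throws IOException {
        final float[] weights = weightsFor(uid, readerContext);
        return new FloatDocValues(this) {
          @Override
          public float floatVal(int doc) {
            return weights[doc];
          }
        };
      }

      @Override
      public String description() {
        return "myFunction(" + uid + ")";
      }

      @Override
      public boolean equals(Object o) {
        return o instanceof ValueSource
            && description().equals(((ValueSource) o).description());
      }

      @Override
      public int hashCode() {
        return description().hashCode();
      }
    };
  }
}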



-Hoss


Re: Solr Cloud not electing leader properly

2013-01-07 Thread Mark Miller
Please see: 
http://lucene.472066.n3.nabble.com/Attention-Solr-4-0-SolrCloud-users-td4024998.html

- Mark

On Jan 7, 2013, at 9:16 PM, davers dboych...@improvementdirect.com wrote:

 I have a SolrCloud as seen here: http://d.pr/i/ya86
 
 When I stop solr-shard-1, solr-shard-4 should become the new leader. Instead
 it does not. Here is the output from the logs.
 
 INFO: A cluster state change has occurred - updating...
 Jan 07, 2013 6:11:54 PM org.apache.solr.cloud.ShardLeaderElectionContext
 runLeaderProcess
 INFO: Running the leader process.
 Jan 07, 2013 6:11:54 PM org.apache.solr.cloud.ShardLeaderElectionContext
 shouldIBeLeader
 INFO: Checking if I should try and be the leader.
 Jan 07, 2013 6:11:54 PM org.apache.solr.cloud.ShardLeaderElectionContext
 shouldIBeLeader
 INFO: My last published State was Active, it's okay to be the leader.
 Jan 07, 2013 6:11:54 PM org.apache.solr.cloud.ShardLeaderElectionContext
 runLeaderProcess
 INFO: I may be the new leader - try and sync
 Jan 07, 2013 6:11:54 PM org.apache.solr.cloud.RecoveryStrategy close
 WARNING: Stopping recovery for
 zkNodeName=solr-shard-4.sys.id.build.com:8080_solr_productindexcore=productindex
 Jan 07, 2013 6:11:54 PM org.apache.solr.cloud.SyncStrategy sync
 INFO: Sync replicas to
 http://solr-shard-4.sys.id.build.com:8080/solr/productindex/
 Jan 07, 2013 6:11:54 PM org.apache.solr.update.PeerSync sync
 INFO: PeerSync: core=productindex
 url=http://solr-shard-4.sys.id.build.com:8080/solr START
 replicas=[http://solr-shard-1.sys.id.build.com:8080/solr/productindex/]
 nUpdates=100
 Jan 07, 2013 6:11:54 PM org.apache.solr.update.PeerSync handleResponse
 WARNING: PeerSync: core=productindex
 url=http://solr-shard-4.sys.id.build.com:8080/solr  exception talking to
 http://solr-shard-1.sys.id.build.com:8080/solr/productindex/, failed
 org.apache.solr.client.solrj.SolrServerException: IOException occured when
 talking to server at:
 http://solr-shard-1.sys.id.build.com:8080/solr/productindex
   at
 org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:413)
   at
 org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:181)
   at
 org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:166)
   at
 org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:133)
   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
   at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
   at
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
   at
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
   at java.lang.Thread.run(Thread.java:722)
 Caused by: java.net.SocketException: Connection reset
   at java.net.SocketInputStream.read(SocketInputStream.java:189)
   at java.net.SocketInputStream.read(SocketInputStream.java:121)
   at
 org.apache.http.impl.io.AbstractSessionInputBuffer.fillBuffer(AbstractSessionInputBuffer.java:149)
   at
 org.apache.http.impl.io.SocketInputBuffer.fillBuffer(SocketInputBuffer.java:111)
   at
 org.apache.http.impl.io.AbstractSessionInputBuffer.readLine(AbstractSessionInputBuffer.java:264)
   at
 org.apache.http.impl.conn.DefaultResponseParser.parseHead(DefaultResponseParser.java:98)
   at
 org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:252)
   at
 org.apache.http.impl.AbstractHttpClientConnection.receiveResponseHeader(AbstractHttpClientConnection.java:282)
   at
 org.apache.http.impl.conn.DefaultClientConnection.receiveResponseHeader(DefaultClientConnection.java:247)
   at
 org.apache.http.impl.conn.AbstractClientConnAdapter.receiveResponseHeader(AbstractClientConnAdapter.java:216)
   at
 org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:298)
   at
 org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
   at
 org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:647)
   at
 org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:464)
   at
 org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:820)
   at
 org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:754)
   at
 org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:732)
   at
 org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:352)
   ... 11 more
 
 Jan 07, 2013 6:11:54 PM org.apache.solr.update.PeerSync sync
 INFO: PeerSync: core=productindex
 url=http://solr-shard-4.sys.id.build.com:8080/solr DONE. sync failed
 Jan 

Re: SOLR Cloud : what is the best backup/restore strategy ?

2013-01-07 Thread Otis Gospodnetic
Hi,

Right, you can continue indexing, but if you need to run
 http://master_host:port/solr/replication?command=backup  on each node and
if you want a snapshot that represents a specific index state, then you
need to stop indexing (and hard commit).  That's what I had in mind.  But
if one just wants *some* snapshot and it doesn't matter that a snapshot on
each node is from a slightly different time with a slightly different
index makeup, so to speak, then yes, just continue indexing.

Otis
--
Solr  ElasticSearch Support
http://sematext.com/
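
To make that concrete, a small SolrJ sketch (host and backup location are
placeholders) that hard-commits and then triggers a backup on one node:

import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.request.QueryRequest;
import org.apache.solr.common.params.ModifiableSolrParams;

public class BackupOneNode {
  public static void main(String[] args) throws Exception {
    HttpSolrServer node = new HttpSolrServer("http://master_host:8983/solr");
    node.commit(); // hard commit so the snapshot reflects a known index state

    ModifiableSolrParams params = new ModifiableSolrParams();
    params.set("command", "backup");
    params.set("location", "/backups/solr"); // assumed writable directory
    QueryRequest request = new QueryRequest(params);
    request.setPath("/replication");
    node.request(request); // repeat on every node for a full cluster backup

    node.shutdown();
  }
}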





On Mon, Jan 7, 2013 at 2:12 PM, Mark Miller markrmil...@gmail.com wrote:

 You should be able to continue indexing fine - it will just keep a point
 in time snapshot around until the copy is done. So you can trigger a backup
 at anytime to create a backup for that specific time, and keep indexing
 away, and the next night do the same thing. You will always have backed up
 to the point in time the backup command is received.

 - Mark

 On Jan 7, 2013, at 1:45 PM, Otis Gospodnetic otis.gospodne...@gmail.com
 wrote:

  Hi,
 
  There may be a better way, but stopping indexing and then
  using http://master_host:port/solr/replication?command=backup on each
 node
  may do the backup trick.  I'd love to see how/if others do it.
 
  Otis
  --
  Solr  ElasticSearch Support
  http://sematext.com/
 
 
 
 
 
  On Mon, Jan 7, 2013 at 10:33 AM, LEFEBVRE Guillaume 
  guillaume.lefeb...@cegedim.fr wrote:
 
  Hello,
 
  Using a SOLR Cloud architecture, what is the best procedure to backup
 and
  restore SOLR index and configuration ?
 
  Thanks,
  Guillaume
 
 




Re: Atomicity of commits (soft OR hard) across replicas - Solr Cloud

2013-01-07 Thread samarth s
Thanks Tomás!! This was useful.


On Mon, Dec 31, 2012 at 6:03 PM, Tomás Fernández Löbbe 
tomasflo...@gmail.com wrote:

 If by cronned commit you mean auto-commit: auto-commits are local to
 each node and are not distributed, so there is nothing like
 cluster-wide atomicity there. The commit may be performed on one node
 now, and on other nodes in 5 minutes (depending on the maxTime you have
 configured).
 If you mean that you are issuing commits from outside Solr, those are going
 to be distributed to all the nodes by default. The operation will succeed
 only if all nodes succeed; if one of the nodes fails, the operation will
 fail. However, the nodes that did succeed WILL have a new view of the index
 at this point. (I'm not sure if something is done in this situation with
 the failing node.)

 The local commit operation in one node *is* atomic.

 Tomás


 On Mon, Dec 31, 2012 at 7:04 AM, samarth s samarth.s.seksa...@gmail.com
 wrote:

  Tried reading articles online, but could not find one that confirmed the
  same 100% :).
 
  Does a cronned soft commit complete its commit cycle only after all the
  replicas have the newest data visible ?
 
  --
  Regards,
  Samarth
 




-- 
Regards,
Samarth
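
As a footnote to Tomás's distinction above: an explicit client-issued
commit is distributed by default, as in this minimal SolrJ sketch (the
ZooKeeper address and collection name are placeholders):

import org.apache.solr.client.solrj.impl.CloudSolrServer;

public class DistributedCommit {
  public static void main(String[] args) throws Exception {
    // unlike per-node auto-commit configured in solrconfig.xml, this
    // external commit is forwarded to all nodes in the collection
    CloudSolrServer server = new CloudSolrServer("127.0.0.1:2181");
    server.setDefaultCollection("collection1");
    server.connect();
    // waitFlush=true, waitSearcher=true: return once new searchers are open
    server.commit(true, true);
    server.shutdown();
  }
}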


Re: How to size a SOLR Cloud

2013-01-07 Thread Per Steffensen

Hi

I have some experience with practical limits. We have several setups we 
have tried to run under high load for a long time:

1)
* 20 shards in one collection spread over 5 nodes (4 shards for the 
collection per node), no redundancy (only one replica per shard)

* Indexing 35-50 mio documents per day and searching a little along the way
* We do not have detailed measurements on searching, but my impression 
is that search response times are fairly ok (below 5 secs for 
non-complicated searches) - at least the first 15 days, up to about 500 
mio documents
* We have very detailed measurements on indexing times though. They are 
good the first 15-17 days, up to 500-600 mio documents. Then we see a 
temporary slow-down in indexing times. This is because major merges 
happen at the same time across all shards. The indexing times speed up 
when this is over, though. After about 20 days everything stops running 
- things just get too slow and eventually nothing happens.

2)
* Same as 1), except 40 shards in one collection spread over 10 nodes, 
no redundancy
* Slowdown points seem to scale linearly - slow-down around 1 billion 
docs and a complete stop at 1.3-1.4 billion docs


Therefore it seems a little strange to me that you have problems with 25 
mio docs in two shards.
One major difference is the redundancy, though. We are running with only 
one replica per shard. We started out trying to run with redundancy (2 
replicas per shard), but that involved a lot of problems. Things never 
successfully recover when recovery situations occur, and we saw roughly 
4x the indexing times compared to non-redundancy (even though a maximum 
of 2x should be expected).


Regards, Per Steffensen


On 1/7/13 6:14 PM, f.fourna...@gibmedia.fr wrote:

Hello,
I'm new to SOLR and I have a collection with 25 million records.
I want to run this collection on SOLR Cloud (Solr 4.0) on Amazon EC2
instances.
Currently I've configured 2 shards and 2 replicas per shard on Medium
instances (4 GB, 1 CPU core), and response times are very long.
How should I size the cloud (sharding, replicas, memory, CPU, ...) to get
acceptable response times in my situation? More memory? More CPU? More
shards? Do rules for sizing a Solr cloud exist?
Is it possible to have more than 2 replicas of one shard? Is it relevant?
Best regards
FF





Re: SOLR Cloud : what is the best backup/restore strategy ?

2013-01-07 Thread Marcin Rzewucki
Definitely, I agree. It's good to stop loading before a snapshot. Anyway,
taking an index snapshot, say, every hour and re-indexing the documents
newer than the last 1-1.5 hours should reduce your index recovery time.

On 8 January 2013 07:36, Otis Gospodnetic otis.gospodne...@gmail.comwrote:

 Hi,

 Right, you can continue indexing, but if you need to run
  http://master_host:port/solr/replication?command=backup  on each node and
 if you want a snapshot that represents a specific index state, then you
 need to stop indexing (and hard commit).  That's what I had in mind.  But
 if one just wants *some* snapshot and it doesn't matter that a snapshot on
 each node is a from a slightly different time with a slightly different
 index make up, so to speak, then yes, just continue indexing.

 Otis
 --
 Solr  ElasticSearch Support
 http://sematext.com/





 On Mon, Jan 7, 2013 at 2:12 PM, Mark Miller markrmil...@gmail.com wrote:

  You should be able to continue indexing fine - it will just keep a point
  in time snapshot around until the copy is done. So you can trigger a
 backup
  at anytime to create a backup for that specific time, and keep indexing
  away, and the next night do the same thing. You will always have backed
 up
  to the point in time the backup command is received.
 
  - Mark
 
  On Jan 7, 2013, at 1:45 PM, Otis Gospodnetic otis.gospodne...@gmail.com
 
  wrote:
 
   Hi,
  
   There may be a better way, but stopping indexing and then
   using http://master_host:port/solr/replication?command=backup on each
  node
   may do the backup trick.  I'd love to see how/if others do it.
  
   Otis
   --
   Solr  ElasticSearch Support
   http://sematext.com/
  
  
  
  
  
   On Mon, Jan 7, 2013 at 10:33 AM, LEFEBVRE Guillaume 
   guillaume.lefeb...@cegedim.fr wrote:
  
   Hello,
  
   Using a SOLR Cloud architecture, what is the best procedure to backup
  and
   restore SOLR index and configuration ?
  
   Thanks,
   Guillaume