RE: Disable hyper-threading for better Solr performance?

2016-03-09 Thread Avner Levy
Currently I'm using Solr 4.8.1, but I can move to another version if it performs 
significantly faster.
My goal is to reach the maximum indexing throughput possible on the machine.
Since the indexing process seems to be CPU bound, I was wondering whether 32 
logical cores with twice as many indexing threads would perform better.
Thanks,
 Avner

-Original Message-
From: Ilan Schwarts [mailto:ila...@gmail.com] 
Sent: Wednesday, March 09, 2016 9:09 AM
To: solr-user@lucene.apache.org
Subject: Re: Disable hyper-threading for better Solr performance?

What is the Solr version and shard config? Standalone? Multiple cores?
Spread over RAID?
On Mar 9, 2016 9:00 AM, "Avner Levy" <av...@checkpoint.com> wrote:

> I have a machine with 16 real cores (32 with HT enabled).
> I'm running a Solr server on it and trying to reach maximum 
> performance for indexing and queries (indexing 20k documents/sec using a 
> number of threads).
> I've read in multiple places that in some scenarios / products 
> disabling hyper-threading may result in better performance.
> I'm looking for input / insights about HT on Solr setups.
> Thanks in advance,
>   Avner
>


Email secured by Check Point


Disable hyper-threading for better Solr performance?

2016-03-08 Thread Avner Levy
I have a machine with 16 real cores (32 with HT enabled).
I'm running a Solr server on it and trying to reach maximum performance for 
indexing and queries (indexing 20k documents/sec using a number of threads).
I've read in multiple places that in some scenarios / products disabling 
hyper-threading may result in better performance.
I'm looking for input / insights about HT on Solr setups.
Thanks in advance,
  Avner
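
Whether 32 logical cores help is best settled by measuring on the machine itself. The following is a minimal sketch of such a measurement, not a Solr API: `IndexingThroughputProbe` and the `indexOneDoc` callback are hypothetical names, and the callback stands in for whatever code actually sends one document to Solr.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.IntConsumer;

/** Hypothetical helper: times a fixed amount of indexing work at a given thread count. */
final class IndexingThroughputProbe {
    /**
     * Calls indexOneDoc once for each document number in [0, totalDocs),
     * spread over `threads` workers, and returns documents per second.
     */
    static double docsPerSecond(int threads, int totalDocs, IntConsumer indexOneDoc) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<?>> pending = new ArrayList<>();
            long start = System.nanoTime();
            for (int t = 0; t < threads; t++) {
                final int worker = t;
                pending.add(pool.submit(() -> {
                    // Worker t handles documents t, t + threads, t + 2 * threads, ...
                    for (int doc = worker; doc < totalDocs; doc += threads) {
                        indexOneDoc.accept(doc);
                    }
                }));
            }
            for (Future<?> f : pending) {
                f.get(); // wait for completion and surface any worker failure
            }
            double seconds = (System.nanoTime() - start) / 1e9;
            return totalDocs / seconds;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("indexing run failed", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Running it with the same document count at 16 and at 32 threads, once with HT enabled and once with it disabled, gives four numbers that can be compared directly. Several repetitions per setting are advisable, since JVM warm-up and segment merges can skew a single run.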


Distributed Search in Solr with different queries per shard

2014-05-21 Thread Avner Levy
I have 2 cores.
One with active data and one with historical data (for documents which were 
removed from the active one).
I want to run Distributed Search on both and get the unified result (as 
supported by Solr Distributed Search, I'm not using Solr Cloud).
My problem is that the query for each core is different.
Is there a way to specify a different query per core and still let Solr unify 
the query results?
For example:
Active data core query: select all green docs
History core query: select all green docs with year=2012
Is there a way to extend the distributed search handler to support such a 
scenario?
Thanks in advance,
  Avner
· One option is to send a unified query to both but then each core will 
work harder for no reason.



RE: Distributed Search in Solr with different queries per shard

2014-05-21 Thread Avner Levy
Yes, there is.
But since the real query is very long and complex per core, I don't want each 
core to work hard on query parts that are only relevant to other cores. 
Perhaps I can write a query plugin which will strip the unnecessary parts on 
each core?
Thanks,
  Avner

-Original Message-
From: Jack Krupansky [mailto:j...@basetechnology.com] 
Sent: Wednesday, May 21, 2014 6:52 PM
To: solr-user@lucene.apache.org
Subject: Re: Distributed Search in Solr with different queries per shard

Unfortunately the same query will be sent to all cores if you use the shards 
parameter to query multiple cores.

Is there some characteristic of the first core that is distinct from the second 
core so that you could OR the differences between the two?

-- Jack Krupansky

-Original Message-
From: Avner Levy
Sent: Wednesday, May 21, 2014 9:56 AM
To: solr-user@lucene.apache.org
Subject: Distributed Search in Solr with different queries per shard

I have 2 cores.
One with active data and one with historical data (for documents which were 
removed from the active one).
I want to run Distributed Search on both and get the unified result (as 
supported by Solr Distributed Search, I'm not using Solr Cloud).
My problem is that the query for each core is different.
Is there a way to specify a different query per core and still let Solr unify 
the query results?
For example:
Active data core query: select all green docs
History core query: select all green docs with year=2012
Is there a way to extend the distributed search handler to support such a 
scenario?
Thanks in advance,
  Avner
· One option is to send a unified query to both but then each core 
will work harder for no reason.



RE: Distributed Search in Solr with different queries per shard

2014-05-21 Thread Avner Levy
I believe unifying multiple query results including facets, paging, sorts and 
other extra features on my own in the application is complex as well.
Is there some Solr code I can use at the application level to unify multiple 
results? (This could actually be an interesting direction.)
The queries were of course just an example. In real life I have 4 cores, each 
with very complex queries, so unifying all 4 may cause significant overhead on 
the system, especially if there are tens of such queries per second.
Thanks,
  Avner

-Original Message-
From: Erick Erickson [mailto:erickerick...@gmail.com] 
Sent: Wednesday, May 21, 2014 6:13 PM
To: solr-user@lucene.apache.org
Subject: Re: Distributed Search in Solr with different queries per shard

I suppose you could, but I _really_ question whether it's a wise investment of 
time. Personally I'd treat them as two different collections and have the app 
layer fire off two queries and do the aggregation (this is a variant of 
federated search I think). This removes your issue with having the cores do 
extra work.

Additionally, I'd really prove out that the extra work is actually a 
measurable performance issue before worrying about this; it smells like 
premature optimization.

FWIW,
Erick

On Wed, May 21, 2014 at 6:56 AM, Avner Levy av...@checkpoint.com wrote:
 I have 2 cores.
 One with active data and one with historical data (for documents which were 
 removed from the active one).
 I want to run Distributed Search on both and get the unified result (as 
 supported by Solr Distributed Search, I'm not using Solr Cloud).
 My problem is that the query for each core is different.
 Is there a way to specify a different query per core and still let Solr 
 unify the query results?
 For example:
 Active data core query: select all green docs
 History core query: select all green docs with year=2012
 Is there a way to extend the distributed search handler to support such a 
 scenario?
 Thanks in advance,
   Avner
 · One option is to send a unified query to both but then each core 
 will work harder for no reason.
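
A minimal sketch of the application-layer aggregation suggested in this thread, assuming each core has already been queried separately and its hits reduced to an id and a score. `Hit` and `ResultMerger` are hypothetical names, not Solr classes, and facet merging is not covered.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** One search hit as returned by a single core (hypothetical shape). */
final class Hit {
    final String id;
    final float score;

    Hit(String id, float score) {
        this.id = id;
        this.score = score;
    }
}

final class ResultMerger {
    /**
     * Merges per-core result lists into one page ordered by descending score.
     * Each core must have been asked for at least (offset + rows) hits,
     * otherwise the merged page may miss documents.
     */
    static List<Hit> mergePage(List<List<Hit>> perCoreHits, int offset, int rows) {
        List<Hit> all = new ArrayList<>();
        for (List<Hit> hits : perCoreHits) {
            all.addAll(hits);
        }
        // List.sort is stable: ties keep the order in which the cores were listed.
        all.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed());
        int from = Math.min(offset, all.size());
        int to = Math.min(offset + rows, all.size());
        return new ArrayList<>(all.subList(from, to));
    }
}
```

Relevance scores computed by different cores are not strictly comparable, because term statistics differ per index, so merging on an explicit sort field (a date, for example) is usually more predictable than merging on score.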



Storing ranges on documents and searching all document with specific value included

2014-01-17 Thread Avner Levy
I have millions of documents with the following fields:
name (string), start version (int), end version (int).



I need to efficiently query all records which satisfy the following:
Select all documents where version >= start version and version <= end version

Running the above query took 50-100 ms, while a similar query on documents 
tagged with each version took only 15 ms.
My question is how efficiently Solr can handle such queries (since this isn't a 
classic FTS query).
Do I need to define something special in order to optimize performance?
Any alternative solutions are welcome.
The field values / types can be changed if needed.
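
For illustration, a minimal sketch of how such a containment condition can be written in Solr's range syntax, assuming the intent is that the queried version lies between a document's start and end versions. The class name and the field names passed in are hypothetical; only the `field:[a TO b]` syntax with `*` as an open bound is standard.

```java
/** Hypothetical helper that builds the query string; it does not talk to Solr. */
final class VersionRangeQuery {
    /**
     * Builds a query matching documents whose [start, end] range contains
     * the given version: start <= version AND end >= version.
     */
    static String containing(String startField, String endField, int version) {
        // Both clauses are mandatory (+); each uses one open-ended bound (*).
        return "+" + startField + ":[* TO " + version + "] +"
                + endField + ":[" + version + " TO *]";
    }
}
```

With the placeholder field names `start_version` and `end_version` and version 7, this yields `+start_version:[* TO 7] +end_version:[7 TO *]`. Sending it as a filter query (`fq`) rather than in `q` lets Solr cache the matching document set, which may help when the same version is queried repeatedly.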



Re: Adding documents in Solr plugin

2013-10-23 Thread Avner Levy
I've tried to write the plugin code.
Currently I do:
    // Reuse one command object for every matching document.
    AddUpdateCommand addUpdateCommand = new AddUpdateCommand(solrQueryRequest);
    DocIterator iterator = docList.iterator();
    SolrIndexSearcher indexReader = solrQueryRequest.getSearcher();
    while (iterator.hasNext()) {
        // Load the stored fields of the next matching document.
        Document document = indexReader.doc(iterator.nextDoc());
        SolrInputDocument solrInputDocument = new SolrInputDocument();
        addUpdateCommand.clear();
        addUpdateCommand.solrDoc = solrInputDocument;
        // Only the key and the changed field are set here.
        addUpdateCommand.solrDoc.setField(id, document.get(id));
        addUpdateCommand.solrDoc.setField(my_updated_field, new_value);
        updateRequestProcessor.processAdd(addUpdateCommand);
    }
But this is very expensive, since the update handler will fetch again the
document which I already have in hand. 
Is there a safe way to update the Lucene document and write it back while
taking into account all the Solr-related code such as caches, extra Solr
logic, etc.?
I was thinking of converting it to a SolrInputDocument and then just adding the
document through Solr, but I first need to convert all fields.
Thanks in advance,
  Avner



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Adding-documents-in-Solr-plugin-tp4071574p4097168.html
Sent from the Solr - User mailing list archive at Nabble.com.


Adding documents in Solr plugin

2013-06-19 Thread Avner Levy
I have a core with millions of records.
I want to add a custom handler which scans the existing documents and updates 
one of the fields (delete and add the document) based on a condition (age12 for 
example).
All fields are stored, so there is no problem recreating the document from the 
search result.
I prefer doing it on the Solr server side to avoid sending millions of 
documents to the client and back.
I'm thinking of writing a Solr plugin which will receive a query and update 
some fields on the matching documents (like the delete by query handler).
Are there existing solutions or better alternatives?
I couldn't find any examples of Solr plugins which update / add / delete 
documents (I don't need to extend the update handler).
If someone has an example it would be a great help.
Thanks in advance


Enabling realtime search in Solr 4.0

2011-12-29 Thread Avner Levy
Hi,
I'm trying to enable realtime search in Solr 4.0 (So I can see new documents 
without committing).
I've added:
<realtime visible="0" facet="true">true</realtime>
<updateLog class="solr.FSUpdateLog">
  <str name="dir">${solr.data.dir:}</str>
</updateLog>

But documents aren't seen before commit (or softCommit).
Any help will be appreciated.
Thanks,
Avner


RE: Enabling realtime search in Solr 4.0

2011-12-29 Thread Avner Levy
Thanks Mark, I appreciate your help.
I need the Solr index to be in sync with my database.
This means that even if one record was added I need it to appear in the next 
search (including faceting).
I've read in the Solr-RA documentation that if you add <realtime>true</realtime> 
you can add documents and search for them without any commit at all (and I 
assumed it was Solr functionality).
So I guess there isn't a way to get such functionality in Solr 4.0, right? I 
think this relates to the ability to open readers from the writer, if I 
understood it correctly?
Does anyone know how different Solr-RA is from regular Solr?
Thanks in advance,
  Avner


-Original Message-
From: Mark Miller [mailto:markrmil...@gmail.com] 
Sent: Thursday, December 29, 2011 5:16 PM
To: solr-user@lucene.apache.org
Subject: Re: Enabling realtime search in Solr 4.0


On Dec 29, 2011, at 3:39 AM, Avner Levy wrote:

 Hi,
 I'm trying to enable realtime search in Solr 4.0 (So I can see new documents 
 without committing).
 I've added:
 <realtime visible="0" facet="true">true</realtime>
 <updateLog class="solr.FSUpdateLog">
   <str name="dir">${solr.data.dir:}</str>
 </updateLog>
 
 But documents aren't seen before commit (or softCommit).
 Any help will be appreciated.
 Thanks,
 Avner


This is how you enable soft auto commit in trunk: 
http://wiki.apache.org/solr/SolrConfigXml?#Update_Handler_Section

You do not need the update log for it - that is for realtime GET (where you 
would also need to set that up in a Request Handler).

Sounds like you are conflating the two.

- Mark Miller
lucidimagination.com


Scanned by Check Point Total Security Gateway.