SOLR capacity planning and disaster recovery

2012-10-21 Thread Worthy LaFollette
CAVEAT: I am a newbie w/r to SOLR (some Lucene experience, but not SOLR
itself). Trying to come up to speed.


What have you all done w/r to SOLR capacity planning and disaster recovery?

I am curious about the following areas:

 - File handles and other ulimit/profile concerns
 - Space calculations (particularly w/r to optimizations, etc.)
 - Taxonomy considerations
 - Single Core vs. Multi-core
 - ?

Also, has anyone planned for disaster recovery for SOLR across non-metro data
centers? It's currently not an issue for me, but it will be shortly.


Re: Doing facet count using group truncating with distributed search

2012-10-21 Thread Erick Erickson
I think you're confusing cores and shards. The comments about distributed
functionality are for a _sharded_ index. A sharded index simply breaks up
a single logical index into parts (shards) and, when the configuration is
set up, queries are automatically sent to all shards and the results collated.
Even in this case, though, care must be taken that no documents in separate
shards have the same ID (uniqueKey) or your results will be wonky. So your
example won't work well even in a sharded situation.
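
To make that concrete, here is a minimal SolrJ sketch of such a sharded,
faceted request (a sketch only; the host names, core names, and facet field
are placeholders, not values from this thread):

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.impl.HttpSolrServer;
    import org.apache.solr.client.solrj.response.QueryResponse;

    public class ShardedFacetSketch {
        public static void main(String[] args) throws Exception {
            // Send the request to one core; the shards parameter fans it out
            // to every shard of the logical index and the results are collated.
            HttpSolrServer server = new HttpSolrServer("http://host1:8983/solr/coreA");
            SolrQuery query = new SolrQuery("*:*");
            query.set("shards", "host1:8983/solr/coreA,host2:8983/solr/coreB");
            query.setFacet(true);
            query.addFacetField("Text"); // placeholder facet field
            QueryResponse response = server.query(query);
            System.out.println(response.getFacetField("Text"));
            // Note: documents in different shards must not share a uniqueKey,
            // or the collated counts will be wrong.
        }
    }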

But what you describe is separate cores. Cores are simply completely distinct
indexes that happen to be served by a single Solr instance; they have
no knowledge of each other, so I don't see how you'd be able to get what
you're looking for.

Best
Erick

P.S. I'm cheating a little when I say cores have no knowledge of each other;
there is the restricted case of cross-core joins, but
that's not germane to your problem.

On Sat, Oct 20, 2012 at 3:49 PM, Kenneth Vindum k...@industry-supply.dk wrote:

 Hi Solr users!

 Could any of you tell me how to do a facet count across several cores,
 excluding duplicates? E.g.:

 Core A:

 Page 1
 Id=a
 Text=hello world

 Page 2
 Id=b
 Text=hello again

 Core B:

 Page 1
 Id=a
 Text=Hej verden

 Id=c
 Text=Ny besked

 Doing a facet count on core A gives me 2 elements. Doing a facet count on
 core B gives me 2 elements as well. Counting across both cores using
 shards should return 3 elements when doing group.truncate on the element
 with Id=a. This would work on a single core, but doing so on more than one
 core always gives me a facet count of 4.

 I've read the Solr wiki page
 (http://wiki.apache.org/solr/FieldCollapsing#Known_Limitations), which says:

   Grouping is also supported for distributed searches from version Solr3.5
   (http://wiki.apache.org/solr/Solr3.5) and from version Solr4.0
   (http://wiki.apache.org/solr/Solr4.0). Currently group.truncate and
   group.func are the only parameters that aren't supported for distributed
   searches.

 Is this because it's not possible to implement this feature, or is it
 because nobody has needed it yet?

 Thanks guys :)

 Kind regards

 Kenneth Vindum




Does SolrCloud support distributed IDFs?

2012-10-21 Thread Sascha Szott
Hi folks,

A known limitation of the old distributed search feature is the lack of
distributed/global IDFs (SOLR-1632). Does SolrCloud bring any improvements in
this direction?

Best regards,
Sascha


Re: Open Source Social (London) - 23rd Oct

2012-10-21 Thread Richard Marr
Last reminder... come along on Tuesday if you can! We'd love to meet you
and share search/NLP/scaling war stories.



 On 11 October 2012 21:59, Richard Marr richard.m...@gmail.com wrote:

 Hi all,

 The next Open Source Search Social is on the 23rd Oct at The Plough, in
 Bloomsbury.

 We usually get a good mix of regulars and newcomers, and a good mix of
 backgrounds and experience levels, so please come along if you can. As
 usual the format is completely open so we'll be talking about whatever is
 most interesting at any one particular moment... ooo, a shiny thing...

 Details and RSVP options on the Meetup page:
 http://www.meetup.com/london-search-social/events/86580442/

 Hope to see you there,

 Richard

 @richmarr

-- 
Richard Marr


Why does SolrIndexSearcher.java enforce mutual exclusion of filter and filterList?

2012-10-21 Thread Aaron Daubman
Greetings,

I'm wondering if somebody would please explain why
SolrIndexSearcher.java enforces mutual exclusion of filter and
filterList
(e.g. see: 
https://github.com/apache/lucene-solr/blob/trunk/solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java#L2039
)
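
For reference, the guard in question amounts to a pair of setters that refuse
to accept both a filter and a filterList. The following is an illustrative
paraphrase, not a verbatim copy of the linked source (field and method names
follow SolrIndexSearcher.QueryCommand, but the exact wording may differ by
version):

    import java.util.List;

    import org.apache.lucene.search.Query;
    import org.apache.solr.search.DocSet;

    // Illustrative paraphrase of the mutual-exclusion guard in QueryCommand.
    class QueryCommandSketch {
        private DocSet filter;
        private List<Query> filterList;

        public QueryCommandSketch setFilterList(List<Query> filterList) {
            if (filter != null) {
                throw new IllegalArgumentException(
                    "Either filter or filterList may be set, but not both.");
            }
            this.filterList = filterList;
            return this;
        }

        public QueryCommandSketch setFilter(DocSet filter) {
            if (filterList != null) {
                throw new IllegalArgumentException(
                    "Either filter or filterList may be set, but not both.");
            }
            this.filter = filter;
            return this;
        }
    }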

For a custom application we have been using this functionality
successfully, and I have been maintaining patches against base releases
from 1.4 through 3.6.1; I am now finally looking at 4.0. Since I
am yet again revisiting this custom patch, I am wondering why this
functionality is prevented out of the box, for two reasons really:

1) It would be great if I didn't have to maintain a custom internal
branch of Solr for this tiny little change.
2) I am worried that the purposeful prevention of this functionality
implies there is a downside to doing this.

Is there a downside to utilizing both a DocSet-based filter and a query-based
filterList?
If not, once I migrate this patch to 4.0, what would be the best way to
get this functionality incorporated into the base?

For additional info, you may find the now-2-year-old issue with
patches addressing this up through 3.6.1 here:
https://issues.apache.org/jira/browse/SOLR-2052

Any insight appreciated as always,
 Aaron


Re: Why does SolrIndexSearcher.java enforce mutual exclusion of filter and filterList?

2012-10-21 Thread Yonik Seeley
On Sun, Oct 21, 2012 at 3:57 PM, Aaron Daubman daub...@gmail.com wrote:
 Greetings,

 I'm wondering if somebody would please explain why
 SolrIndexSearcher.java enforces mutual exclusion of filter and
 filterList
 (e.g. see: 
 https://github.com/apache/lucene-solr/blob/trunk/solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java#L2039
 )

AFAIK, it's only prevented because it's not implemented (or at least
it wasn't in the past). Things may have changed since I restructured
filtering to implement post filters, though. If you remove the checks,
does everything actually work today?

It was never implemented because Solr didn't use it itself (and the
vast majority of people use Solr as-is w/o internal customization).
Should be fine to support as long as it doesn't complicate the
implementation too much.

-Yonik
http://lucidworks.com


Re: Does SolrCloud support distributed IDFs?

2012-10-21 Thread Mark Miller

On 10/21/2012 01:27 PM, Sascha Szott wrote:

Hi folks,

A known limitation of the old distributed search feature is the lack of
distributed/global IDFs (SOLR-1632). Does SolrCloud bring any improvements in
this direction?

Best regards,
Sascha

Still waiting on that issue. I think Andrzej should just update it to
trunk and commit - it's optional and defaults to off. Go vote :)


- Mark


IOException occured when talking to server

2012-10-21 Thread Jason
Hi,
I'm encountering the error below repeatedly when trying out distributed search.
At that time, none of the servers was stalled.
Does anyone know what the problem is?



2012-10-18 09:09:54,813 [http-8080-exec-8819] ERROR
org.apache.solr.core.SolrCore - org.apache.solr.common.SolrException:
org.apache.solr.client.solrj.SolrServerException: IOException occured when
talking to server at: http://203.242.170.141:8080/solr_jt/jtp00
at
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:298)
at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1561)
at
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:445)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:266)
at
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
at
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
at
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
at
org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:470)
at
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
at
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
at
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
at
org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298)
at
org.apache.coyote.http11.Http11NioProcessor.process(Http11NioProcessor.java:889)
at
org.apache.coyote.http11.Http11NioProtocol$Http11ConnectionHandler.process(Http11NioProtocol.java:732)
at
org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.run(NioEndpoint.java:2262)
at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Caused by: org.apache.solr.client.solrj.SolrServerException: IOException
occured when talking to server at: http://203.242.170.141:8080/solr_jt/jtp00
at
org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:409)
at
org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:182)
at
org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:165)
at
org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:132)
at
java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
at
java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
... 3 more
Caused by: java.net.SocketException: Connection reset
at java.net.SocketInputStream.read(SocketInputStream.java:168)
at
org.apache.http.impl.io.AbstractSessionInputBuffer.fillBuffer(AbstractSessionInputBuffer.java:149)
at
org.apache.http.impl.io.SocketInputBuffer.fillBuffer(SocketInputBuffer.java:111)
at
org.apache.http.impl.io.AbstractSessionInputBuffer.readLine(AbstractSessionInputBuffer.java:264)
at
org.apache.http.impl.conn.DefaultResponseParser.parseHead(DefaultResponseParser.java:98)
at
org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:252)
at
org.apache.http.impl.AbstractHttpClientConnection.receiveResponseHeader(AbstractHttpClientConnection.java:282)
at
org.apache.http.impl.conn.DefaultClientConnection.receiveResponseHeader(DefaultClientConnection.java:247)
at
org.apache.http.impl.conn.AbstractClientConnAdapter.receiveResponseHeader(AbstractClientConnAdapter.java:216)
at
org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:298)
at
org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
at
org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:647)
at
org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:464)
at
org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:820)
at
org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:754)
at
org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:732)
at
org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:351)
... 11 more


RAMDirectory - still stores some docs on disk?

2012-10-21 Thread deniz
Hello 

I am using RAMDirectory for running some experiments and ran into a
weird (well, for me) situation. Basically, after indexing into RAM, I
killed the JVM and then restarted it after some time. I can still see some
documents as indexed and searchable. I had indexed more than 2M docs before
shutting down, and after the restart there were around 15K docs still in the
index.

So how could this happen? Is there some caching mechanism that backs up some
amount(?) of the total index, or is the index written directly to disk? (If so,
how?)
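
For context, a plain Lucene RAMDirectory lives only in the JVM heap, so
nothing indexed through it should survive a process restart on its own; the
sketch below illustrates that (it is not the poster's actual setup, which
isn't shown, and the field name is a placeholder). If documents do survive a
restart, something else must be persisting them to disk.

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.document.TextField;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.index.IndexWriterConfig;
    import org.apache.lucene.store.RAMDirectory;
    import org.apache.lucene.util.Version;

    public class RamDirectorySketch {
        public static void main(String[] args) throws Exception {
            RAMDirectory dir = new RAMDirectory(); // heap-only storage
            IndexWriterConfig cfg = new IndexWriterConfig(
                Version.LUCENE_40, new StandardAnalyzer(Version.LUCENE_40));
            IndexWriter writer = new IndexWriter(dir, cfg);
            Document doc = new Document();
            doc.add(new TextField("text", "hello world", Field.Store.YES));
            writer.addDocument(doc);
            writer.close();
            // Nothing above touches the filesystem; after a JVM restart the
            // index built here is simply gone.
        }
    }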






-
Zeki ama calismiyor... Calissa yapar...
--
View this message in context: 
http://lucene.472066.n3.nabble.com/RAMDirectory-still-stores-some-docs-on-disk-tp4015022.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: SOLR 4 BETA facet.pivot and cloud

2012-10-21 Thread dplutcho
When I run SOLR 4 BETA with ZooKeeper, even if I specify shards=1, pivoting
does not seem to work.

FYI, facet.pivot does work, in 4.0, on a per-shard basis if you include the
distrib=false parameter.
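
A minimal SolrJ sketch of such a per-shard request (the shard URL and the
pivot fields are placeholders, not values from this thread):

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.impl.HttpSolrServer;
    import org.apache.solr.client.solrj.response.QueryResponse;

    public class PerShardPivotSketch {
        public static void main(String[] args) throws Exception {
            HttpSolrServer shard = new HttpSolrServer("http://host1:8983/solr/shard1");
            SolrQuery q = new SolrQuery("*:*");
            q.setFacet(true);
            q.set("facet.pivot", "cat,inStock"); // placeholder pivot fields
            q.set("distrib", "false");           // keep the request on this shard only
            QueryResponse rsp = shard.query(q);
            System.out.println(rsp.getResponse()); // raw response, includes facet_pivot
        }
    }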



--
View this message in context: 
http://lucene.472066.n3.nabble.com/SOLR-4-BETA-facet-pivot-and-cloud-tp4011841p4015030.html
Sent from the Solr - User mailing list archive at Nabble.com.