Re: Upgraded to 4.10.3, highlighting performance unusably slow

2015-05-03 Thread jaime spicciati
We ran into this as well on 4.10.3 (not related to an upgrade). It was
identified during load testing, when a small percentage of queries would
take more than 20 seconds to return. We were able to isolate it by
rerunning the same query multiple times; regardless of cache hits, the
queries would still take a long time to return. We used this method to
narrow the performance problem down to a small number of very large
records (many, many fields in a single record).

We fixed it by turning on hl.requireFieldMatch on the query so that only
fields that have an actual hit are passed through the highlighter.
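
For reference, hl.requireFieldMatch can be passed as a query parameter or
set as a request handler default; a minimal sketch of the latter in
solrconfig.xml (the handler name and the other defaults here are
illustrative, not our actual configuration):

<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="hl">true</str>
    <!-- Only highlight fields that actually have a query hit, rather than
         running the highlighter over every field matched by hl.fl. -->
    <str name="hl.requireFieldMatch">true</str>
  </lst>
</requestHandler>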

Hopefully this helps,
Jaime Spicciati

On Sat, May 2, 2015 at 8:20 PM, Joel Bernstein <joels...@gmail.com> wrote:

 Hi,

 Can you also include the details of your research that narrowed the issue
 to the highlighter?

 Joel Bernstein
 http://joelsolr.blogspot.com/

 On Sat, May 2, 2015 at 5:27 PM, Ryan, Michael F. (LNG-DAY)
 <michael.r...@lexisnexis.com> wrote:

  Are you able to identify if there is a particular part of the code that
 is
  slow?
 
  A simple way to do this is to use the jstack command (assuming your
 server
  has the full JDK installed). You can run it like this:
  /path/to/java/bin/jstack PID
 
  If you run that a bunch of times while your highlight query is running,
  you might be able to spot the hotspot. Usually I'll do something like
 this
  to see the stacktrace for the thread running the query:
  /path/to/java/bin/jstack PID | grep SearchHandler -B30
 
  A few more questions:
  - What response times are you seeing before and after the upgrade? Is
  "unusably slow" 1 second, 10 seconds...?
  - If you run the exact same query multiple times, is it consistently
 slow?
  Or is it only slow on the first run?
  - While the query is running, do you see high user CPU on your server, or
  high IO wait, or both? (You can check this with the top command or vmstat
  command in Linux.)
 
  -Michael
 
  -----Original Message-----
  From: Cheng, Sophia Kuen [mailto:sophia_ch...@hms.harvard.edu]
  Sent: Saturday, May 02, 2015 4:13 PM
  To: solr-user@lucene.apache.org
  Subject: Upgraded to 4.10.3, highlighting performance unusably slow
 
  Hello,
 
  We recently upgraded Solr from 3.8.0 to 4.10.3. We saw that this upgrade
  caused an incredible slowdown in our searches. We were able to narrow it
  down to the highlighting. The slowdown is extreme enough that we are
  holding back our release until we can resolve this. Our research indicated
  that using TermVectors and the FastVectorHighlighter was the way to go;
  however, this still does nothing for the performance. I think we may be
  overlooking a crucial configuration, but cannot figure it out. I was hoping
  for some guidance and help. Sorry for the long email, I wanted to provide
  enough information.
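
  For context, we are enabling the highlighter roughly along these lines (a
  simplified sketch; the handler name and defaults shown here are
  illustrative, our actual handler configuration is further below):

  <requestHandler name="/select" class="solr.SearchHandler">
    <lst name="defaults">
      <str name="hl">true</str>
      <str name="hl.fl">*</str>
      <!-- The FastVectorHighlighter needs termVectors, termPositions and
           termOffsets on the highlighted fields (see the field type below). -->
      <str name="hl.useFastVectorHighlighter">true</str>
    </lst>
  </requestHandler>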
 
  Our documents are largely dynamic fields, and so we have been using '*' as
  the field for highlighting. This is the same setting as we used in prior
  versions of Solr. The dynamic fields are of type 'text', and we added
  customizations to the schema.xml for the type 'text':
 
  <fieldType name="text" class="solr.TextField" positionIncrementGap="100"
    storeOffsetsWithPositions="true" termVectors="true" termPositions="true"
    termOffsets="true">
    <analyzer type="index">
      <!-- this charFilter removes all xml-tagging from the text: -->
      <charFilter class="solr.HTMLStripCharFilterFactory"/>
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <!-- Case insensitive stop word removal.
           add enablePositionIncrements="true" in both the index and query
           analyzers to leave a 'gap' for more accurate phrase queries.
      -->
      <filter class="solr.StopFilterFactory" ignoreCase="true"
        words="stopwords.txt" enablePositionIncrements="true"/>
      <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
        generateNumberParts="1" catenateWords="1" catenateNumbers="1"
        catenateAll="0" splitOnCaseChange="1"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.SnowballPorterFilterFactory" language="English"
        protected="protwords.txt"/>
    </analyzer>
    <analyzer type="query">
      <!-- this charFilter removes all xml-tagging from the text. Needed
           also in query due to autosuggest -->
      <charFilter class="solr.HTMLStripCharFilterFactory"/>
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <filter class="solr.StopFilterFactory" ignoreCase="true"
        words="stopwords.txt" enablePositionIncrements="true"/>
      <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
        generateNumberParts="1" catenateWords="0" catenateNumbers="0"
        catenateAll="0" splitOnCaseChange="1"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.SnowballPorterFilterFactory" language="English"
        protected="protwords.txt"/>
    </analyzer>
  </fieldType>
 
  One of the two dynamic fields we use:
 
  <dynamicField name="DTPropValue_*" type="text" indexed="true"
    stored="true" required="false" multiValued="true"/>
 
  In our solrConfig.xml file, we have:
 
  <requestHandler name="/eiHandler" class

Re: java.net.SocketException: broken pipe Solr 4.10.2

2015-04-14 Thread jaime spicciati
We ran into this during our indexing process running on 4.10.3. After
increasing ZooKeeper timeouts, client timeouts, and socket timeouts, and
implementing retry logic in our loading process, the thing that finally
worked was changing the hard commit timing. We were performing a hard
commit every 5 minutes, and after a couple of hours of loading data some of
the shards would start going down because they would time out with
ZooKeeper and/or close connections. Changing the timeouts just moved the
problem later in the ingest process.

Through a combination of decreasing the hard commit interval to 15 seconds
and migrating to the G1 garbage collector, we are able to prevent ingest
failures. For us, the periodic stop-the-world garbage collections were
causing connections to be closed and other nasty things, such as ZooKeeper
timeouts that would cause recovery to kick in. (Soft commits are turned off
until the full ingest/baseline completes.) I believe that until a hard
commit is issued, Solr keeps the data in memory, which explains why we were
experiencing nasty garbage collections.
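
For reference, a minimal sketch of the kind of autoCommit setting this ends
up being in solrconfig.xml (the 15-second interval is what we used;
openSearcher=false is an assumption, consistent with soft commits being
disabled during the baseline load):

<updateHandler class="solr.DirectUpdateHandler2">
  <!-- Hard commit every 15 seconds to flush in-memory data to stable
       storage without opening a new searcher. -->
  <autoCommit>
    <maxTime>15000</maxTime>
    <openSearcher>false</openSearcher>
  </autoCommit>
</updateHandler>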

The other change we made, which may have helped, is that we ensured the
socket timeouts were in sync between the Jetty instance running Solr and
the SolrJ client loading the data. During some of our batch updates Solr
would take a couple of minutes to respond, and I believe that in some
instances the server-side socket would be closed (the maxIdleTime setting
in Jetty).
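
A rough sketch of the server-side knob involved, for reference (the
connector class and values here are illustrative; adjust to whatever
connector your jetty.xml actually defines, and keep maxIdleTime at or above
the SolrJ client's socket timeout):

<Call name="addConnector">
  <Arg>
    <New class="org.eclipse.jetty.server.bio.SocketConnector">
      <Set name="port"><SystemProperty name="jetty.port" default="8983"/></Set>
      <!-- How long an idle connection is kept open before Jetty closes it;
           long-running batch updates should finish within this window. -->
      <Set name="maxIdleTime">300000</Set>
    </New>
  </Arg>
</Call>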

Hope this helps,
Jaime Spicciati

Thanks
Jaime


On Tue, Apr 14, 2015 at 9:26 AM, vsilgalis <vsilga...@gmail.com> wrote:

 Right now index size is about 10GB on each shard (yes, I could use more
 RAM), but I'm looking more for a step-up than a step-down approach. I will
 try adding more RAM to these machines as my next step.

 1. Zookeeper is external to these boxes in a three node cluster with more
 than enough RAM to keep everything off disk.

 2. OS disk cache: when I add more RAM I will just add it as RAM for the
 machine and not to the Java heap, unless that is something you recommend.

 3. Java heap looks good so far; GC is minimal as far as I can tell, but I
 can look into this some more.

 4. we do have 2 cores per machine, but the second core is a joke (10MB)

 note: zkClientTimeout is set to 30 for safety's sake.

 java settings:

 -XX:+CMSClassUnloadingEnabled -XX:+AggressiveOpts -XX:+ParallelRefProcEnabled
 -XX:+CMSParallelRemarkEnabled -XX:CMSMaxAbortablePrecleanTime=6000
 -XX:CMSTriggerPermRatio=80 -XX:CMSInitiatingOccupancyFraction=50
 -XX:+UseCMSInitiatingOccupancyOnly -XX:CMSFullGCsBeforeCompaction=1
 -XX:PretenureSizeThreshold=64m -XX:+CMSScavengeBeforeRemark
 -XX:ParallelGCThreads=4 -XX:ConcGCThreads=4 -XX:+UseConcMarkSweepGC
 -XX:+UseParNewGC -XX:MaxTenuringThreshold=8 -XX:TargetSurvivorRatio=90
 -XX:SurvivorRatio=4 -XX:NewRatio=3 -XX:-UseSuperWord -Xmx5588m -Xms1596m






Leading Wildcard Support (ReversedWildcardFilterFactory)

2015-02-26 Thread jaime spicciati
All,
I am currently using 4.10.3 running Solr Cloud.

I have configured my index analyzer to leverage
solr.ReversedWildcardFilterFactory with various settings for
maxFractionAsterisk, maxPosAsterisk, etc. Currently I am running with the
defaults (i.e., not explicitly configured).

Using the Analysis capability in the Solr admin UI, I see the Field Value
(Index) tokens coming out correctly, in both normal order and reversed
order. However, on the Field Value (Query) side it is not generating a
reversed token as expected (no matter where I place the * in the leading
position of the search term). I also confirmed through the Query capability,
with debugQuery turned on, that the parsed query is not reversed as expected.

From my current understanding, you do not need to have anything configured
on the query analyzer to make leading wildcards work as expected with the
ReversedWildcardFilterFactory. The default query parser will know to look
at the index analyzer and leverage the ReversedWildcardFilterFactory
configuration if the term contains a leading wildcard. (This is what I have
read)

Without uploading my entire configuration to this email I was hoping
someone could point me in the right direction because I am at a loss at
this point.

Thanks!


Re: Leading Wildcard Support (ReversedWildcardFilterFactory)

2015-02-26 Thread jaime spicciati
Thanks for the quick response.

The index I am currently testing with has the following configuration, which
is the default for the text_general_rev field type:

The field type is solr.TextField

maxFractionAsterisk=.33
maxPosAsterisk=3
maxPosQuestion=2
withOriginal=true
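
For reference, roughly how this looks in schema.xml (a sketch based on the
stock text_general_rev example; the analyzer chains are trimmed, and the
tokenizer/filters shown are assumptions rather than my exact configuration):

<fieldType name="text_general_rev" class="solr.TextField"
           positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Index both the original token and its reversed form so that a
         leading-wildcard query can be rewritten against the reversed term. -->
    <filter class="solr.ReversedWildcardFilterFactory" withOriginal="true"
            maxPosAsterisk="3" maxPosQuestion="2" maxFractionAsterisk="0.33"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>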

Through additional review I think it *might* be working as expected, even
though the Analysis tab and the debugQuery parsed query lead me to think
otherwise. If I look at the explain plan from debugQuery when I actually get
a hit, I see words that come back in reversed order with the \u0001 prefix
character, so the actual hit against the inverted index appears to be correct
even though the parsed query doesn't reflect this. Is it safe to say that
things are in fact working correctly?

Thanks again



On Thu, Feb 26, 2015 at 3:34 PM, Jack Krupansky <jack.krupan...@gmail.com>
wrote:

 Please post your field type... or at least confirm a comparison to the
 example in the javadoc:

 http://lucene.apache.org/solr/4_10_3/solr-core/org/apache/solr/analysis/ReversedWildcardFilterFactory.html

 -- Jack Krupansky

 On Thu, Feb 26, 2015 at 2:38 PM, jaime spicciati
 <jaime.spicci...@gmail.com>
 wrote:

  All,
  I am currently using 4.10.3 running Solr Cloud.
 
  I have configured my index analyzer to leverage
  solr.ReversedWildcardFilterFactory with various settings for
  maxFractionAsterisk, maxPosAsterisk, etc. Currently I am running with the
  defaults (i.e., not explicitly configured).
 
  Using the Analysis capability in the Solr admin UI, I see the Field Value
  (Index) tokens coming out correctly, in both normal order and reversed
  order. However, on the Field Value (Query) side it is not generating a
  reversed token as expected (no matter where I place the * in the leading
  position of the search term). I also confirmed through the Query
  capability, with debugQuery turned on, that the parsed query is not
  reversed as expected.
 
  From my current understanding, you do not need to have anything configured
  on the query analyzer to make leading wildcards work as expected with the
  ReversedWildcardFilterFactory. The default query parser will know to look
  at the index analyzer and leverage the ReversedWildcardFilterFactory
  configuration if the term contains a leading wildcard. (This is what I
 have
  read)
 
  Without uploading my entire configuration to this email I was hoping
  someone could point me in the right direction because I am at a loss at
  this point.
 
  Thanks!
 



Question about session affinity and SolrCloud

2015-02-14 Thread jaime spicciati
All,
This is my current understanding of how SolrCloud load balancing works...

Within SolrCloud, for a cluster with more than 1 shard and at least 1
replica, the ZooKeeper-aware SolrJ client uses LBHttpSolrServer, which
round-robins across the replicas and leaders in the cluster. In turn, the
node (which can host a leader or a replica) that performs the distributed
query may then go to the leader or a replica of each shard, again chosen
round-robin via LBHttpSolrServer.

If this is correct, then in a SolrCloud cluster with, let's say, 1 replica,
the initial query from the user may go to the leader of shard 1, and when
the user paginates to the second page the subsequent query may go to the
replica of shard 1. This seems inefficient from a caching perspective, since
the queryResultCache and possibly the filterCache would need to be reloaded.

From what I can find, there does not appear to be any option for session
affinity within SolrCloud query execution; is that correct?


Thanks!


SolrCloud multi-datacenter failover?

2015-01-02 Thread jaime spicciati
All,

At my current customer we have developed a custom federator that federates
queries between Endeca and Solr, to ease the transition from an extremely
large (TBs of data) Endeca index to Solr. (Endeca is similar to Solr in
terms of search, faceted navigation, etc.)



During this transition we need to support multi-datacenter failover, which
we have historically handled via load balancers with the appropriate
failover configurations (think F5). We are currently playing our data loads
into multiple datacenters to ensure data consistency. (Each datacenter has
a stand-alone SolrCloud instance with its own redundancy/failover.)



I am curious to see how the community handles multi-datacenter failover
at the presentation layer (datacenter A goes down and we want to fail over
to B). SolrCloud will handle failures within a single datacenter, but in
order to support multi-datacenter failover I haven't seen a definitive
‘answer’ as to how to handle this situation.



At this point the only two options I can come up with are:

1) Fail the entire datacenter if SolrCloud goes offline (GUI/index/etc. go
offline).

 - This is problematic because some portion of user activity will fail;
queries that are in transit will not complete.

2) Implement failover at the custom federator level. In doing so we would
need to detect a failure at datacenter A within our federator, then query
datacenter B to fulfill the user request, then potentially fail the entire
datacenter A once all transactions against A have been fulfilled.



Since we are looking up the active Solr instances via ZooKeeper (SolrCloud)
per datacenter, I don't see any reasonable means of failing over to another
datacenter if a given SolrCloud instance goes down.


Any thoughts are welcome at this point.

Thanks

Jaime