Re: Can the solr dataimporthandler consume an atom feed?

2014-03-21 Thread Gora Mohanty
On 22 March 2014 02:55, eShard  wrote:
>
> Good afternoon,
> I'm using solr 4.0 Final.
> I have an IBM atom feed I'm trying to index but it won't work.
> There are no errors in the log.
> All the other DIH configurations I've created consume RSS 2.0 feeds.
> Does it NOT work with an atom feed?
[...]

Atom is XML, and your DIH data configuration file looks fine on the
face of it. What message do you get when you do a full-import?
Can you also provide a sample of your feed?

Regards,
Gora


Re: Required fields

2014-03-21 Thread Alexei Martchenko
false
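
In schema.xml terms, that means these two declarations behave identically
(field name invented for illustration):

<field name="title" type="string" indexed="true" stored="true"/>
<field name="title" type="string" indexed="true" stored="true" required="false"/>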


alexei martchenko
Facebook | LinkedIn | Steam | 4sq | Skype: alexeiramone | GitHub | (11) 9 7613.0966


2014-03-21 17:17 GMT-03:00 Walter Underwood :

> What is the default value for the required attribute of a field element in
> a schema? I've just looked everywhere I can think of in the wiki, the
> reference manual, and the JavaDoc. Most of the documentation doesn't even
> mention that attribute.
>
> Once we answer this, it should be added to the documented attributes for
> field.
>
> wunder
> --
> Walter Underwood
> wun...@wunderwood.org
>
>
>
>


Re: using SolrJ with SolrCloud, searching multiple indexes.

2014-03-21 Thread Furkan KAMACI
Hi Russell;

You say that:

  | CloudSolrServer server = new CloudSolrServer("solrServer1:2111,solrServer2:2111,solrServer2:2111");

but I should mention that those are not Solr servers being passed into a
CloudSolrServer. They are ZooKeeper host:port pairs, optionally followed by a
chroot path at the end.
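
For illustration, a minimal SolrJ sketch (the ZooKeeper hostnames, the /solr
chroot, and the collection names are assumptions for the example). Note that
in SolrCloud a search can span several collections by passing the
"collection" parameter as a comma-separated list, instead of the old shards
parameter:

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.CloudSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class CloudQueryExample {
    public static void main(String[] args) throws Exception {
        // The constructor takes the ZooKeeper ensemble (host:port pairs),
        // optionally followed by a chroot path.
        CloudSolrServer server =
                new CloudSolrServer("zk1:2181,zk2:2181,zk3:2181/solr");
        server.setDefaultCollection("bonds");

        SolrQuery query = new SolrQuery("*:*");
        // Ask for a distributed search across several collections at once.
        query.set("collection", "bonds,equities");
        QueryResponse response = server.query(query);
        System.out.println("hits: " + response.getResults().getNumFound());

        server.shutdown();
    }
}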

Thanks;
Furkan KAMACI



2014-03-21 18:11 GMT+02:00 Russell Taylor <
russell.tay...@interactivedata.com>:

> Hi,
> just started to move my SolrJ queries over to our SolrCloud environment
> and I want to know how to do a query where you combine multiple indexes.
>
> Previously I had a string called shards which links all the indexes
> together and adds them to the query.
> String shards =
> "server:8080/solr_search/bonds,server:8080/solr_search/equities,etc"
> which I add to my SolrQuery
> solrQuery.add("shards",shards);
> I can then search across many indexes.
>
> In SolrCloud we do this
> CloudSolrServer server = new
> CloudSolrServer("solrServer1:2111,solrServer2:2111,solrServer2:2111");
> and add the default collection
> server.setDefaultCollection("bonds");
>
> How do I add the other indexes to my query in CloudSolrServer? If it's as
> before solrQuery.add("shards",shards); how do I find out the address of the
> machine CloudSolrServer has chosen?
>
>
>
> Thanks
>
>
> Russ.
>
>
>


Can the solr dataimporthandler consume an atom feed?

2014-03-21 Thread eShard
Good afternoon,
I'm using solr 4.0 Final.
I have an IBM atom feed I'm trying to index but it won't work.
There are no errors in the log.
All the other DIH configurations I've created consume RSS 2.0 feeds.
Does it NOT work with an atom feed?

here's my configuration (the list archive stripped most of the XML; the
attributes that survive are):

<entity url="https://[redacted]"
        processor="XPathEntityProcessor"
        forEach="/atom:feed/atom:entry"
        transformer="DateFormatTransformer,TemplateTransformer">
  [field mappings stripped by the archive]
</entity>



Required fields

2014-03-21 Thread Walter Underwood
What is the default value for the required attribute of a field element in a 
schema? I've just looked everywhere I can think of in the wiki, the reference 
manual, and the JavaDoc. Most of the documentation doesn't even mention that 
attribute.

Once we answer this, it should be added to the documented attributes for field.

wunder
--
Walter Underwood
wun...@wunderwood.org





Re: Limit on # of collections -SolrCloud

2014-03-21 Thread Chris W
Sorry for the piecemeal approach, but I had another question. I have a 3-node
zk ensemble. Does making 2 of the zk nodes observers help speed up bootup of
Solr (due to a decrease in the time it takes to elect leaders for shards)?


On Fri, Mar 21, 2014 at 11:49 AM, Chris W  wrote:

> Thanks Tim. I would definitely try that next time. I have seen a few
> instances where the overseer_queue was not getting processed, but that looks
> like an existing bug which got fixed in 4.6 (the overseer doesn't process
> requests when a collection reload fails).
>
> One question: Assuming our cluster can tolerate downtime of about 10-15
> minutes, is it OK to restart all Solr nodes at the same time? Or will there
> be race conditions during recovery?
>
>
>
>
> On Fri, Mar 21, 2014 at 11:08 AM, Mark Miller wrote:
>
>>
>> On March 21, 2014 at 1:46:13 PM, Tim Potter (tim.pot...@lucidworks.com)
>> wrote:
>>
>> We've seen instances where you end up restarting the overseer node each
>> time as you restart the cluster, which causes all kinds of craziness.
>>
>>
>> That would be a great test to add to the suite.
>>
>> --
>> Mark Miller
>> about.me/markrmiller
>>
>>
>
>
> --
> Best
> --
> C
>



-- 
Best
-- 
C


Re: join and filter query with AND

2014-03-21 Thread Kranti Parisa
My example should also work, am I missing something?

&q=({!join from=inner_id to=outer_id fromIndex=othercore
v=$joinQuery})&joinQuery=(city:"Stara Zagora" AND prod:214)

Thanks,
Kranti K. Parisa
http://www.linkedin.com/in/krantiparisa



On Fri, Mar 21, 2014 at 2:11 PM, Yonik Seeley  wrote:

> Correct.  This is only a limitation of embedding a local-params style
> subquery within lucene syntax.
> The parser, not knowing the syntax of the embedded query, currently
> assumes the query text ends at whitespace or other special punctuation
> such as ")".
>
> Original:
> (({!join from=inner_id to=outer_id fromIndex=othercore}city:"Stara
> Zagora")) AND (prod:214)
>
> Some possible workarounds that should work:
> &q={!join from=inner_id to=outer_id fromIndex=othercore}city:"Stara Zagora"
> &fq=prod:214
>
> &q=({!join from=inner_id to=outer_id fromIndex=othercore
> v='city:"Stara Zagora"'} AND prod:214)
>
> &q=({!join from=inner_id to=outer_id fromIndex=othercore v=$jq} AND
> prod:214)
> &jq=city:"Stara Zagora"
>
>
> -Yonik
> http://heliosearch.org - solve Solr GC pauses with off-heap filters
> and fieldcache
>
>
> On Fri, Mar 21, 2014 at 1:54 PM, Jack Krupansky 
> wrote:
> > I suspect that this is a bug in the implementation of the parsing of
> > embedded nested query parsers. That's a fairly new feature compared to
> > non-embedded nested query parsers - maybe Yonik could shed some light.
> This
> > may date from when he made a copy of the Lucene query parser for Solr and
> > added the parsing of embedded nested query parsers to the grammar. It
> seems
> > like the embedded nested query parser is only being applied to a single,
> > whitespace-delimited term, and not respecting the fact that the term is
> a
> > quoted phrase.
> >
> > -- Jack Krupansky
> >
> > -Original Message- From: Marcin Rzewucki
> > Sent: Thursday, March 20, 2014 5:19 AM
> > To: solr-user@lucene.apache.org
> > Subject: Re: join and filter query with AND
> >
> >
> > Nope. There is no line break in the string and it is not fed from a file.
> > What else could be the reason?
> >
> >
> >
> > On 19 March 2014 17:57, Erick Erickson  wrote:
> >
> >> It looks to me like you're feeding this from some
> >> kind of text file and you really _do_ have a
> >> line break after "Stara
> >>
> >> Or have a line break in the string you paste into the URL
> >> or something similar.
> >>
> >> Kind of shooting in the dark though.
> >>
> >> Erick
> >>
> >> On Wed, Mar 19, 2014 at 8:48 AM, Marcin Rzewucki 
> >> wrote:
> >> > Hi,
> >> >
> >> > I have the following issue with join query parser and filter query.
> For
> >> > such query:
> >> >
> >> > *:*
> >> > 
> >> > (({!join from=inner_id to=outer_id fromIndex=othercore}city:"Stara
> >> > Zagora")) AND (prod:214)
> >> > 
> >> >
> >> > I got error:
> >> > 
> >> > 
> >> > org.apache.solr.search.SyntaxError: Cannot parse 'city:"Stara':
> Lexical
> >> > error at line 1, column 12. Encountered:  after : "\"Stara"
> >> > 
> >> > 400
> >> > 
> >> >
> >> > Stack:
> >> > DEBUG - 2014-03-19 13:35:20.825;
> >> org.eclipse.jetty.servlet.ServletHandler;
> >> > chain=SolrRequestFilter->default
> >> > DEBUG - 2014-03-19 13:35:20.826;
> >> > org.eclipse.jetty.servlet.ServletHandler$CachedChain; call filter
> >> > SolrRequestFilter
> >> > ERROR - 2014-03-19 13:35:20.828; org.apache.solr.common.SolrException;
> >> > org.apache.solr.common.SolrException: >
> >> > org.apache.solr.search.SyntaxError:
> >> > Cannot parse 'city:"Stara': Lexical error at line 1, column 12.  E
> >> > ncountered:  after : "\"Stara"
> >> > at
> >> >
> >>
> >>
> org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:179)
> >> > at
> >> >
> >>
> >>
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:193)
> >> > at
> >> >
> >>
> >>
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
> >> > at org.apache.solr.core.SolrCore.execute(SolrCore.java:1916)
> >> > at
> >> >
> >>
> >>
> org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:780)
> >> > at
> >> >
> >>
> >>
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:427)
> >> > at
> >> >
> >>
> >>
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:217)
> >> > at
> >> >
> >>
> >>
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
> >> > at
> >> >
> >>
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
> >> > at
> >> >
> >>
> >>
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
> >> > at
> >> >
> >>
> >>
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
> >> > at
> >> >
> >>
> >>
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
> >> > at
> >> >
> >>
> >>
> org.eclipse.jetty.server.han

Re: Limit on # of collections -SolrCloud

2014-03-21 Thread Chris W
Thanks Tim. I would definitely try that next time. I have seen a few
instances where the overseer_queue was not getting processed, but that looks
like an existing bug which got fixed in 4.6 (the overseer doesn't process
requests when a collection reload fails).

One question: Assuming our cluster can tolerate downtime of about 10-15
minutes, is it OK to restart all Solr nodes at the same time? Or will there
be race conditions during recovery?




On Fri, Mar 21, 2014 at 11:08 AM, Mark Miller  wrote:

>
> On March 21, 2014 at 1:46:13 PM, Tim Potter (tim.pot...@lucidworks.com)
> wrote:
>
> We've seen instances where you end up restarting the overseer node each
> time as you restart the cluster, which causes all kinds of craziness.
>
>
> That would be a great test to add to the suite.
>
> --
> Mark Miller
> about.me/markrmiller
>
>


-- 
Best
-- 
C


Re: join and filter query with AND

2014-03-21 Thread Yonik Seeley
Correct.  This is only a limitation of embedding a local-params style
subquery within lucene syntax.
The parser, not knowing the syntax of the embedded query, currently
assumes the query text ends at whitespace or other special punctuation
such as ")".

Original:
(({!join from=inner_id to=outer_id fromIndex=othercore}city:"Stara
Zagora")) AND (prod:214)

Some possible workarounds that should work:
&q={!join from=inner_id to=outer_id fromIndex=othercore}city:"Stara Zagora"
&fq=prod:214

&q=({!join from=inner_id to=outer_id fromIndex=othercore
v='city:"Stara Zagora"'} AND prod:214)

&q=({!join from=inner_id to=outer_id fromIndex=othercore v=$jq} AND prod:214)
&jq=city:"Stara Zagora"
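
In SolrJ the last workaround is plain parameter setting, e.g. (a sketch
reusing the core and field names from this thread):

import org.apache.solr.client.solrj.SolrQuery;

public class JoinQueryExample {
    public static SolrQuery build() {
        SolrQuery query = new SolrQuery();
        // The phrase stays out of the local-params block; it is pulled in
        // through the $jq parameter reference instead.
        query.set("q", "({!join from=inner_id to=outer_id fromIndex=othercore v=$jq} AND prod:214)");
        query.set("jq", "city:\"Stara Zagora\"");
        return query;
    }
}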


-Yonik
http://heliosearch.org - solve Solr GC pauses with off-heap filters
and fieldcache


On Fri, Mar 21, 2014 at 1:54 PM, Jack Krupansky  wrote:
> I suspect that this is a bug in the implementation of the parsing of
> embedded nested query parsers. That's a fairly new feature compared to
> non-embedded nested query parsers - maybe Yonik could shed some light. This
> may date from when he made a copy of the Lucene query parser for Solr and
> added the parsing of embedded nested query parsers to the grammar. It seems
> like the embedded nested query parser is only being applied to a single,
> whitespace-delimited term, and not respecting the fact that the term is a
> quoted phrase.
>
> -- Jack Krupansky
>
> -Original Message- From: Marcin Rzewucki
> Sent: Thursday, March 20, 2014 5:19 AM
> To: solr-user@lucene.apache.org
> Subject: Re: join and filter query with AND
>
>
> Nope. There is no line break in the string and it is not fed from a file.
> What else could be the reason?
>
>
>
> On 19 March 2014 17:57, Erick Erickson  wrote:
>
>> It looks to me like you're feeding this from some
>> kind of text file and you really _do_ have a
>> line break after "Stara
>>
>> Or have a line break in the string you paste into the URL
>> or something similar.
>>
>> Kind of shooting in the dark though.
>>
>> Erick
>>
>> On Wed, Mar 19, 2014 at 8:48 AM, Marcin Rzewucki 
>> wrote:
>> > Hi,
>> >
>> > I have the following issue with join query parser and filter query. For
>> > such query:
>> >
>> > *:*
>> > 
>> > (({!join from=inner_id to=outer_id fromIndex=othercore}city:"Stara
>> > Zagora")) AND (prod:214)
>> > 
>> >
>> > I got error:
>> > 
>> > 
>> > org.apache.solr.search.SyntaxError: Cannot parse 'city:"Stara': Lexical
>> > error at line 1, column 12. Encountered:  after : "\"Stara"
>> > 
>> > 400
>> > 
>> >
>> > Stack:
>> > DEBUG - 2014-03-19 13:35:20.825;
>> org.eclipse.jetty.servlet.ServletHandler;
>> > chain=SolrRequestFilter->default
>> > DEBUG - 2014-03-19 13:35:20.826;
>> > org.eclipse.jetty.servlet.ServletHandler$CachedChain; call filter
>> > SolrRequestFilter
>> > ERROR - 2014-03-19 13:35:20.828; org.apache.solr.common.SolrException;
>> > org.apache.solr.common.SolrException: >
>> > org.apache.solr.search.SyntaxError:
>> > Cannot parse 'city:"Stara': Lexical error at line 1, column 12.  E
>> > ncountered:  after : "\"Stara"
>> > at
>> >
>>
>> org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:179)
>> > at
>> >
>>
>> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:193)
>> > at
>> >
>>
>> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
>> > at org.apache.solr.core.SolrCore.execute(SolrCore.java:1916)
>> > at
>> >
>>
>> org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:780)
>> > at
>> >
>>
>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:427)
>> > at
>> >
>>
>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:217)
>> > at
>> >
>>
>> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
>> > at
>> >
>> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
>> > at
>> >
>>
>> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
>> > at
>> >
>>
>> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
>> > at
>> >
>>
>> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
>> > at
>> >
>>
>> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
>> > at
>> >
>> > org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
>> > at
>> >
>>
>> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
>> > at
>> >
>>
>> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
>> > at
>> >
>>
>> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
>> > at
>> >
>>
>> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
>> > 

RE: Limit on # of collections -SolrCloud

2014-03-21 Thread Mark Miller

On March 21, 2014 at 1:46:13 PM, Tim Potter (tim.pot...@lucidworks.com) wrote:

We've seen instances where you end up restarting the overseer node each time as 
you restart the cluster, which causes all kinds of craziness. 


That would be a great test to add to the suite.

-- 
Mark Miller
about.me/markrmiller



Re: join and filter query with AND

2014-03-21 Thread Jack Krupansky
I suspect that this is a bug in the implementation of the parsing of 
embedded nested query parsers. That's a fairly new feature compared to
non-embedded nested query parsers - maybe Yonik could shed some light. This 
may date from when he made a copy of the Lucene query parser for Solr and 
added the parsing of embedded nested query parsers to the grammar. It seems 
like the embedded nested query parser is only being applied to a single, 
whitespace-delimited term, and not respecting the fact that the term is a
quoted phrase.


-- Jack Krupansky

-Original Message- 
From: Marcin Rzewucki

Sent: Thursday, March 20, 2014 5:19 AM
To: solr-user@lucene.apache.org
Subject: Re: join and filter query with AND

Nope. There is no line break in the string and it is not fed from a file.
What else could be the reason?



On 19 March 2014 17:57, Erick Erickson  wrote:


It looks to me like you're feeding this from some
kind of text file and you really _do_ have a
line break after "Stara

Or have a line break in the string you paste into the URL
or something similar.

Kind of shooting in the dark though.

Erick

On Wed, Mar 19, 2014 at 8:48 AM, Marcin Rzewucki 
wrote:
> Hi,
>
> I have the following issue with join query parser and filter query. For
> such query:
>
> *:*
> 
> (({!join from=inner_id to=outer_id fromIndex=othercore}city:"Stara
> Zagora")) AND (prod:214)
> 
>
> I got error:
> 
> 
> org.apache.solr.search.SyntaxError: Cannot parse 'city:"Stara': Lexical
> error at line 1, column 12. Encountered:  after : "\"Stara"
> 
> 400
> 
>
> Stack:
> DEBUG - 2014-03-19 13:35:20.825;
org.eclipse.jetty.servlet.ServletHandler;
> chain=SolrRequestFilter->default
> DEBUG - 2014-03-19 13:35:20.826;
> org.eclipse.jetty.servlet.ServletHandler$CachedChain; call filter
> SolrRequestFilter
> ERROR - 2014-03-19 13:35:20.828; org.apache.solr.common.SolrException;
> org.apache.solr.common.SolrException: 
> org.apache.solr.search.SyntaxError:

> Cannot parse 'city:"Stara': Lexical error at line 1, column 12.  E
> ncountered:  after : "\"Stara"
> at
>
org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:179)
> at
>
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:193)
> at
>
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
> at org.apache.solr.core.SolrCore.execute(SolrCore.java:1916)
> at
>
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:780)
> at
>
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:427)
> at
>
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:217)
> at
>
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
> at
>
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
> at
>
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
> at
>
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
> at
>
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
> at
>
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
> at
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
> at
>
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
> at
>
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
> at
>
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
> at
>
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
> at
>
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
> at
>
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
> at org.eclipse.jetty.server.Server.handle(Server.java:364)
> at
>
org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
> at
>
org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)
> at
>
org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:942)
> at
>
org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1004)
> at
org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:640)
> at
> org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
> at
>
org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
> at
>
org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)
> at
>
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
> at
>
org.ecl

RE: Limit on # of collections -SolrCloud

2014-03-21 Thread Tim Potter
Hi Chris,

Thanks for the link to Patrick's github (looks like some good stuff in there).

One thing to try (and this isn't the final word on this, but is helpful) is to 
go into the tree view in the Cloud panel and find out which node is hosting the 
Overseer (/overseer_elect/leader). When restarting your cluster, make sure you 
restart this node last. We've seen instances where you end up restarting the 
overseer node each time as you restart the cluster, which causes all kinds of 
craziness. I'll bet you'll see better results by doing this, but let us know 
either way.

Also, at 600 cores per machine, I had to reduce the JVM's thread stack size to 
-Xss256k as there are an extreme number of threads allocated when starting up 
that many cores.

Cheers,

Timothy Potter
Sr. Software Engineer, LucidWorks
www.lucidworks.com


From: Chris W 
Sent: Thursday, March 20, 2014 4:31 PM
To: solr-user@lucene.apache.org
Subject: Re: Limit on # of collections -SolrCloud

The replication factor is two. I have equally sharded all collections
across all nodes. We have a 6-node cluster setup: 300 collections * 6 shards
and 2 replicas per shard. I have almost 600 cores per machine.

Also, one fact is that my zk timeout is on the order of 2-3 minutes. I see
very slow zk responses and a lot of outstanding requests (found that out
thanks to https://github.com/phunt/).




On Thu, Mar 20, 2014 at 2:53 PM, Otis Gospodnetic <
otis.gospodne...@gmail.com> wrote:

> Hours sounds too long indeed.  We recently had a client with several
> thousand collections, but restart wasn't taking hours...
>
> Otis
> Solr & ElasticSearch Support
> http://sematext.com/
> On Mar 20, 2014 5:49 PM, "Erick Erickson"  wrote:
>
> > How many total replicas are we talking here?
> > As in how many shards and, for each shard,
> > how many replicas? I'm not asking for a long list
> > here, just if you have a bazillion replicas in aggregate.
> >
> > Hours is surprising.
> >
> > Best,
> > Erick
> >
> > On Thu, Mar 20, 2014 at 2:17 PM, Chris W 
> wrote:
> > > Thanks, Shalin. Making clusterstate.json on a collection basis sounds
> > > awesome.
> > >
> > > I am not having problems with #2. #3 is a major time hog in my
> > > environment. I have over 300+ collections and restarting the entire
> > > cluster takes on the order of hours (2-3 hours). Can you explain more
> > > about the leaderVoteWait setting?
> > >
> > >
> > >
> > >
> > > On Thu, Mar 20, 2014 at 1:28 PM, Shalin Shekhar Mangar <
> > > shalinman...@gmail.com> wrote:
> > >
> > >> There are no arbitrary limits on the number of collections but yes
> > >> there are practical limits. For example, the cluster state can become
> > >> a bottleneck. There is a lot of work happening on finding and
> > >> addressing these problems. See
> > >> https://issues.apache.org/jira/browse/SOLR-5381
> > >>
> > >> Boot up time is because of:
> > >> 1) Core discovery, schema/config parsing etc
> > >> 2) Transaction log replay on startup
> > >> 3) Wait time for enough replicas to become available before leader
> > >> election happens
> > >>
> > >> You can't do much about 1 right now I think. For #2, you can keep your
> > >> transaction logs smaller by a hard commit before shutdown. For #3
> > >> there is a leaderVoteWait settings but I'd rather not touch that
> > >> unless it becomes a problem.
> > >>
> > >> On Fri, Mar 21, 2014 at 1:39 AM, Chris W 
> > wrote:
> > >> > Hi there
> > >> >
> > >> >  Is there a limit on the # of collections solrcloud can support? Can
> > >> > zk/solrcloud handle 1000s of collections?
> > >> >
> > >> > Also i see that the bootup time of solrcloud increases with increase
> > in #
> > >> > of cores. I do not have any expensive warm up queries. How do i
> > speedup
> > >> > solr startup?
> > >> >
> > >> > --
> > >> > Best
> > >> > --
> > >> > C
> > >>
> > >>
> > >>
> > >> --
> > >> Regards,
> > >> Shalin Shekhar Mangar.
> > >>
> > >
> > >
> > >
> > > --
> > > Best
> > > --
> > > C
> >
>



--
Best
--
C

Re: SolrCell and indexing HTML

2014-03-21 Thread Jack Krupansky
The extractOnly option is simply telling you what the raw metadata is, while 
normal non-extractOnly mode indexes the metadata exactly as you have requested 
it to be indexed. You haven't shown us any of your parameters that describe 
how you want the metadata indexed. If you didn't specify any mapping, it was 
probably all thrown away.


Read the tutorial on Solr Cell if you are not yet aware of how to map 
metadata:

https://cwiki.apache.org/confluence/display/solr/Uploading+Data+with+Solr+Cell+using+Apache+Tika

Or read that chapter in my e-book! It has lots of examples, especially for 
the various mapping parameters.
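
As a hedged sketch of what such mapping parameters can look like in SolrJ
(the field and prefix names are assumptions, not taken from the poster's
schema):

import java.net.URL;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.request.AbstractUpdateRequest;
import org.apache.solr.client.solrj.request.ContentStreamUpdateRequest;
import org.apache.solr.common.util.ContentStreamBase;

public class ExtractExample {
    public static void index(SolrServer server, String id, String pageUrl)
            throws Exception {
        ContentStreamUpdateRequest req =
                new ContentStreamUpdateRequest("/update/extract");
        req.setParam("literal.id", id);
        req.setParam("fmap.content", "text");  // map Tika's "content" to the "text" field
        req.setParam("uprefix", "ignored_");   // send unmapped metadata to ignored_* dynamic fields
        req.setParam("lowernames", "true");    // normalize metadata names to lowercase
        req.addContentStream(new ContentStreamBase.URLStream(new URL(pageUrl)));
        // Commit so the document becomes visible to searches.
        req.setAction(AbstractUpdateRequest.ACTION.COMMIT, true, true);
        server.request(req);
    }
}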


-- Jack Krupansky

-Original Message- 
From: Liz Sommers

Sent: Friday, March 21, 2014 12:56 PM
To: solr-user
Subject: SolrCell and indexing HTML

I am trying to write a POC about indexing URLs with Solr using SolrJ and
SolrCell. (The code is written in Groovy).

The relevant code is here

ContentStreamUpdateRequest req = new
ContentStreamUpdateRequest("/update/extract");

   req.setParam("literal.id",p.id.toString())
   req.setParam("extractOnly","true")
   URL url = new URL(p.url)
   ContentStream stream = new ContentStreamBase.URLStream(url)
   req.addContentStream(stream)

   def result = server.request(req)
   println "result: ${result}"


When I set extractOnly to true I get everything in the URL.  All the tags,
all the stylesheets.  When I set it to false I get a response that has
nothing in it except

result: {responseHeader={status=0,QTime=19}}

When I test it with the admin tools, nothing from the URL has been indexed as
far as I can tell.
I know I am doing something wrong with the params, but I haven't figured
out what. Can somebody please help me?

Thanks
Liz Sommers
lizzy...@gmail.com
lizswo...@gmail.com 



Getting 500s on distributed queries with SolrCloud

2014-03-21 Thread Ugo Matrangolo
Hi,

I have a two-shard collection running and I'm getting this error on each
query:

2014-03-21 17:08:42,018 [qtp-75] ERROR
org.apache.solr.servlet.SolrDispatchFilter  -
null:java.lang.IllegalArgumentException:
numHits must be > 0; please use TotalHitCountCollector if you just need the
total hit count
at
org.apache.lucene.search.TopFieldCollector.create(TopFieldCollector.java:1130)
at
org.apache.lucene.search.TopFieldCollector.create(TopFieldCollector.java:1079)
at
org.apache.lucene.search.grouping.AbstractSecondPassGroupingCollector.(AbstractSecondPassGroupingCollector.java:75)
at
org.apache.lucene.search.grouping.term.TermSecondPassGroupingCollector.(TermSecondPassGroupingCollector.java:49)
at
org.apache.solr.search.grouping.distributed.command.TopGroupsFieldCommand.create(TopGroupsFieldCommand.java:129)
at
org.apache.solr.search.grouping.CommandHandler.execute(CommandHandler.java:142)
at
org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:387)
at
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:214)
at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1916)
at
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:780)

Note that I'm using grouping, and disabling it fixed the problem.

I was aware that SolrCloud does not fully support grouping in a
distributed setup, but I was expecting incorrect results (which have to be
addressed with custom hashing, afaik) and not an error.

Has anyone seen this error before?

Ugo


Re: SolrCell and indexing HTML

2014-03-21 Thread Greg Walters
I've never tried indexing via Groovy or using SolrCell, but I think you might be
working a bit too low-level in SolrJ if you're just adding documents. You might
try checking out https://wiki.apache.org/solr/Solrj#Adding_Data_to_Solr and I
might be way off base :)
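
For reference, the plain SolrJ add-document path from that wiki page looks
roughly like this (field names are illustrative):

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.common.SolrInputDocument;

public class AddExample {
    public static void add(SolrServer server) throws Exception {
        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", "doc-1");
        doc.addField("title", "An example document");
        server.add(doc);
        server.commit();  // make the document searchable
    }
}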

Thanks,
Greg

On Mar 21, 2014, at 11:56 AM, Liz Sommers  wrote:

> I am trying to write a POC about indexing URLs with Solr using SolrJ and
> SolrCell. (The code is written in Groovy).
> 
> The relevant code is here
> 
> ContentStreamUpdateRequest req = new
> ContentStreamUpdateRequest("/update/extract");
> 
>req.setParam("literal.id",p.id.toString())
>req.setParam("extractOnly","true")
>URL url = new URL(p.url)
>ContentStream stream = new ContentStreamBase.URLStream(url)
>req.addContentStream(stream)
> 
>def result = server.request(req)
>println "result: ${result}"
> 
> 
> When I set extractOnly to true I get everything in the URL.  All the tags,
> all the stylesheets.  When I set it to false I get a response that has
> nothing in it except
> 
> result: {responseHeader={status=0,QTime=19}}
> 
> When I test it with the admin tools, nothing from the URL has been indexed as
> far as I can tell.
> I know I am doing something wrong with the params, but I haven't figured
> out what. Can somebody please help me?
> 
> Thanks
> Liz Sommers
> lizzy...@gmail.com
> lizswo...@gmail.com



SolrCell and indexing HTML

2014-03-21 Thread Liz Sommers
I am trying to write a POC about indexing URLs with Solr using SolrJ and
SolrCell. (The code is written in Groovy).

The relevant code is here

ContentStreamUpdateRequest req = new
ContentStreamUpdateRequest("/update/extract");

req.setParam("literal.id",p.id.toString())
req.setParam("extractOnly","true")
URL url = new URL(p.url)
ContentStream stream = new ContentStreamBase.URLStream(url)
req.addContentStream(stream)

def result = server.request(req)
println "result: ${result}"


When I set extractOnly to true I get everything in the URL.  All the tags,
all the stylesheets.  When I set it to false I get a response that has
nothing in it except

result: {responseHeader={status=0,QTime=19}}

When I test it with the admin tools, nothing from the URL has been indexed as
far as I can tell.
I know I am doing something wrong with the params, but I haven't figured
out what. Can somebody please help me?

Thanks
Liz Sommers
lizzy...@gmail.com
lizswo...@gmail.com


Re: Bootstrapping SolrCloud cluster with multiple collections in differene sharding/replication setup

2014-03-21 Thread Ugo Matrangolo
Hi,

I had a nice talk on IRC about this. The right thing to do is to start with a
clean SOLR cluster (no cores) and then create all the proper collections
with the Collections API.

Ugo


On Thu, Mar 20, 2014 at 7:26 PM, Jeff Wartes  wrote:

>
> Please note that although the article talks about the ADDREPLICA command,
> that feature is coming in Solr 4.8, so don't be confused if you can't find
> it yet. See https://issues.apache.org/jira/browse/SOLR-5130
>
>
>
> On 3/20/14, 7:45 AM, "Erick Erickson"  wrote:
>
> >You might find this useful:
> >http://heliosearch.org/solrcloud-assigning-nodes-machines/
> >
> >
> >It uses the collections API to create your collection with zero
> >nodes, then shows how to assign your leaders to specific
> >machines (well, at least specify the nodes the leaders will
> >be created on, it doesn't show how to assign, for instance,
> >shard1 to nodeX)
> >
> >It also shows a way to assign specific replicas on specific nodes
> >to specific shards, although as Mark says this is a transitional
> >technique. I know there's an "addreplica" command in the works
> >for the collections API that should make this easier, but that's
> >not released yet.
> >
> >Best,
> >Erick
> >
> >
> >On Thu, Mar 20, 2014 at 7:23 AM, Ugo Matrangolo
> > wrote:
> >> Hi,
> >>
> >> I would like some advice about the best way to bootstrap from scratch a
> >> SolrCloud cluster housing at least two collections with different
> >> sharding/replication setup.
> >>
> >> Going through the docs/'Solr In Action' book, what I have seen so far is
> >> that there is a way to bootstrap a SolrCloud cluster with sharding
> >> configuration using the:
> >>
> >>   -DnumShards=2
> >>
> >> but this (afaik) works only for a single collection. What I need is a way
> >> to deploy from scratch a SolrCloud cluster housing (e.g.) two collections,
> >> Foo and Bar, where Foo has only one shard and is replicated everywhere,
> >> while Bar has three shards and, again, is replicated.
> >>
> >> I can't find a config file in which to put this sharding plan, and I'm
> >> starting to think that the only way to do this is after the deploy, using
> >> the Collections API.
> >>
> >> Is there a best approach way to do this ?
> >>
> >> Ugo
>
>


using SolrJ with SolrCloud, searching multiple indexes.

2014-03-21 Thread Russell Taylor
Hi,
just started to move my SolrJ queries over to our SolrCloud environment and I
want to know how to do a query where you combine multiple indexes.

Previously I had a string called shards which links all the indexes together 
and adds them to the query.
String shards = 
"server:8080/solr_search/bonds,server:8080/solr_search/equities,etc"
which I add to my SolrQuery
solrQuery.add("shards",shards);
I can then search across many indexes.

In SolrCloud we do this
CloudSolrServer server = new 
CloudSolrServer("solrServer1:2111,solrServer2:2111,solrServer2:2111");
and add the default collection
server.setDefaultCollection("bonds");

How do I add the other indexes to my query in CloudSolrServer? If it's as 
before solrQuery.add("shards",shards); how do I find out the address of the 
machine CloudSolrServer has chosen?



Thanks


Russ.




Re: join and filter query with AND

2014-03-21 Thread Kranti Parisa
You may try this:

({!join from=inner_id to=outer_id fromIndex=othercore v=$joinQuery})

And pass another parameter: joinQuery=(city:"Stara Zagora" AND prod:214)

Thanks,
Kranti K. Parisa
http://www.linkedin.com/in/krantiparisa



On Fri, Mar 21, 2014 at 4:47 AM, Marcin Rzewucki wrote:

> Hi,
>
> Erick, I do not get your point. What kind of servlet container settings do
> you mean, and why do you think they might be related? I'm using Jetty and
> never set any limit on packet size. My query fails only in the case of
> double quotes with a space between the words. Why? It works in the other
> cases described in my first mail.
>
> Cheers.
>
>
>
> On 20 March 2014 15:23, Erick Erickson  wrote:
>
> > Well, the error message really looks like your input is
> > getting chopped off.
> >
> > It's vaguely possible that you have some super-low limit
> > in your servlet container configuration that is only letting very
> > small packets through.
> >
> > What I'd do is look in the Solr log file to see exactly what
> > is coming through. Because regardless of what you _think_
> > you're sending, it _really_ looks like Solr is getting the fq
> > clause with something that breaks it up. So I'd like to
> > absolutely nail that as being wrong before speculating.
> >
> > Because I can cut/paste your fq clause just fine. Of course
> > it fails because I don't have the other core defined, but that
> > means the query has made it through query parsing while
> > yours hasn't in your setup.
> >
> > Best,
> > Erick
> >
> > On Thu, Mar 20, 2014 at 2:19 AM, Marcin Rzewucki 
> > wrote:
> > > Nope. There is no line break in the string and it is not fed from a
> > > file.
> > > What else could be the reason?
> > >
> > >
> > >
> > > On 19 March 2014 17:57, Erick Erickson 
> wrote:
> > >
> > >> It looks to me like you're feeding this from some
> > >> kind of text file and you really _do_ have a
> > >> line break after "Stara
> > >>
> > >> Or have a line break in the string you paste into the URL
> > >> or something similar.
> > >>
> > >> Kind of shooting in the dark though.
> > >>
> > >> Erick
> > >>
> > >> On Wed, Mar 19, 2014 at 8:48 AM, Marcin Rzewucki  >
> > >> wrote:
> > >> > Hi,
> > >> >
> > >> > I have the following issue with join query parser and filter query.
> > For
> > >> > such query:
> > >> >
> > >> > *:*
> > >> > 
> > >> > (({!join from=inner_id to=outer_id fromIndex=othercore}city:"Stara
> > >> > Zagora")) AND (prod:214)
> > >> > 
> > >> >
> > >> > I got error:
> > >> > 
> > >> > 
> > >> > org.apache.solr.search.SyntaxError: Cannot parse 'city:"Stara':
> > Lexical
> > >> > error at line 1, column 12. Encountered:  after : "\"Stara"
> > >> > 
> > >> > 400
> > >> > 
> > >> >
> > >> > Stack:
> > >> > DEBUG - 2014-03-19 13:35:20.825;
> > >> org.eclipse.jetty.servlet.ServletHandler;
> > >> > chain=SolrRequestFilter->default
> > >> > DEBUG - 2014-03-19 13:35:20.826;
> > >> > org.eclipse.jetty.servlet.ServletHandler$CachedChain; call filter
> > >> > SolrRequestFilter
> > >> > ERROR - 2014-03-19 13:35:20.828;
> org.apache.solr.common.SolrException;
> > >> > org.apache.solr.common.SolrException:
> > org.apache.solr.search.SyntaxError:
> > >> > Cannot parse 'city:"Stara': Lexical error at line 1, column 12.  E
> > >> > ncountered:  after : "\"Stara"
> > >> > at
> > >> >
> > >>
> >
> org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:179)
> > >> > at
> > >> >
> > >>
> >
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:193)
> > >> > at
> > >> >
> > >>
> >
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
> > >> > at org.apache.solr.core.SolrCore.execute(SolrCore.java:1916)
> > >> > at
> > >> >
> > >>
> >
> org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:780)
> > >> > at
> > >> >
> > >>
> >
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:427)
> > >> > at
> > >> >
> > >>
> >
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:217)
> > >> > at
> > >> >
> > >>
> >
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
> > >> > at
> > >> >
> > >>
> >
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
> > >> > at
> > >> >
> > >>
> >
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
> > >> > at
> > >> >
> > >>
> >
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
> > >> > at
> > >> >
> > >>
> >
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
> > >> > at
> > >> >
> > >>
> >
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
> > >> > at
> > >> >
> > org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
> > >> > at
> > >> >
> > >>
> >
> org.eclips

Re: Is there a Field/FieldType property to enable storing of term vector payloads?

2014-03-21 Thread Daniel Jamrog
Doug, thanks for the quick reply!

I have a separate question:

The FieldType I was using when I ran into this happened to be a
PreAnalyzedField.  I tried to specify an alternate parser to be used
instead of the default JsonPreAnalyzedParser.  To do this, I added the
parameter parserImpl to my FieldType definition in schema.xml, but this
caused an "invalid arguments" exception to be thrown by the FieldType
class.  It seems that PreAnalyzedField reads the parserImpl arg, but
doesn't remove it from the args map.  Did I do something wrong here or is
this a bug?

Thanks,
Dan


On Fri, Mar 21, 2014 at 11:18 AM, Doug Turnbull <
dturnb...@opensourceconnections.com> wrote:

> Daniel, I had a similar issue. The option is not in the normal FieldType,
> so I had to create my own FieldType in a Solr plugin that enabled payloads
> in term vectors. This mostly involved extending TextField, copy pasting
> "createField" from FieldType (TextField's parent class) and adding the one
> line
>
> public class PayloadTextField extends TextField {
> @Override
> public IndexableField createField(SchemaField field, Object value, float
> boost) {
> // copy-paste from FieldType.createField
> //...
> // this is the one-and-only line we change from what we inherit
> newType.setStoreTermVectorPayloads(field.storeTermVector()); return
> createField(field.getName(), val, newType, boost);
> }
>
> }
>
> Unfortunately there's no method to override to redefine the Lucene field
> type that's used (newType) so you have to copy-paste this whole thing.
>
>
> On Fri, Mar 21, 2014 at 9:53 AM, Daniel Jamrog 
> wrote:
>
> > I see properties to enable term vectors, positions and offsets, but
> didn't
> > find one for payloads?  Did I just miss it?   If not, is this something
> > that may be added in the future?
> >
> > Thanks
> >
>
>
>
> --
> Doug Turnbull
> Search & Big Data Architect
> OpenSource Connections 
>


Re: Is there a Field/FieldType property to enable storing of term vector payloads?

2014-03-21 Thread Doug Turnbull
Daniel, I had a similar issue. The option is not in the normal FieldType,
so I had to create my own FieldType in a Solr plugin that enabled payloads
in term vectors. This mostly involved extending TextField, copy pasting
"createField" from FieldType (TextField's parent class) and adding the one
line

public class PayloadTextField extends TextField {
  @Override
  public IndexableField createField(SchemaField field, Object value, float boost) {
    // copy-paste from FieldType.createField
    // ...
    // this is the one-and-only line we change from what we inherit
    newType.setStoreTermVectorPayloads(field.storeTermVector());
    return createField(field.getName(), val, newType, boost);
  }
}

Unfortunately there's no method to override to redefine the Lucene field
type that's used (newType) so you have to copy-paste this whole thing.
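
Once the class is on Solr's classpath, wiring it into schema.xml looks like
any other field type (the package, type, and field names here are
assumptions for illustration):

<fieldType name="payload_text" class="com.example.PayloadTextField"
           positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.DelimitedPayloadTokenFilterFactory" encoder="float"/>
  </analyzer>
</fieldType>
<field name="body" type="payload_text" indexed="true" stored="true"
       termVectors="true" termPositions="true" termOffsets="true"/>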


On Fri, Mar 21, 2014 at 9:53 AM, Daniel Jamrog  wrote:

> I see properties to enable term vectors, positions and offsets, but didn't
> find one for payloads?  Did I just miss it?   If not, is this something
> that may be added in the future?
>
> Thanks
>



-- 
Doug Turnbull
Search & Big Data Architect
OpenSource Connections 


Re: Solr4.7 No live SolrServers available to handle this request

2014-03-21 Thread Michael Sokolov
I just managed to track this down -- as you said the disconnect was a 
red herring.


Ultimately the problem was caused by a custom analysis component we 
wrote that was raising an IOException -- it was missing some 
configuration files it relies on.


What might be interesting for Solr devs to have a look at is that the
exception was completely swallowed by JavabinCodec, making it very 
difficult to track down the problem.  Furthermore -- if the /add request 
was routed directly to the shard where the document was destined to end 
up, then the IOException raised by the analysis component (a char 
filter) showed up in the Solr HTTP response (probably because my client 
used XML format in one test -- javabin is used internally in 
SolrCloud).  But if the request was routed to a different shard, then 
the only exception that showed up anywhere (in the logs, in the HTTP 
response) was kind of irrelevant.


I think this could be fixed pretty easily; see SOLR-5985 for my suggestion.

-Mike


On 03/21/2014 10:20 AM, Greg Walters wrote:

Broken pipe errors are generally caused by unexpected disconnections and are sometimes 
hard to track down. Given the stack traces you've provided it's hard to point to any one 
thing and I suspect the relevant information was snipped out in the "long dump of 
document fields". You might grab the entire error from the client you're uploading 
documents with, the server you're connected to and any other nodes that have an error at 
the same time and put it on pastebin or the like.

Thanks,
Greg

On Mar 20, 2014, at 3:36 PM, Michael Sokolov  
wrote:


I'm getting a similar exception when writing documents (on the client side).  I 
can write one document fine, but the second (which is being routed to a 
different shard) generates the error.  It happens every time - definitely not a 
resource issue or timing problem since this database is completely empty -- I'm 
just getting started and running some tests, so there must be some kind of 
setup problem.  But it's difficult to diagnose (for me, anyway)!  I'd 
appreciate any insight, hints, guesses, etc. since I'm stuck. Thanks!

One node (the leader?) is reporting "Internal Server Error" in its log, and 
another node (presumably the shard where the document is being directed) bombs out like 
this:

ERROR - 2014-03-20 15:56:53.022; org.apache.solr.common.SolrException; 
null:org.apache.solr.common.SolrException: ERROR adding document 
SolrInputDocument(

... long dump of document fields

)
at 
org.apache.solr.handler.loader.JavabinLoader$1.update(JavabinLoader.java:99)
at 
org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readOuterMostDocIterator(JavaBinUpdateRequestCodec.java:166)
at 
org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readIterator(JavaBinUpdateRequestCodec.java:136)
at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:225)
at 
org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readNamedList(JavaBinUpdateRequestCodec.java:121)
at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:190)
at org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:116)
at 
org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec.unmarshal(JavaBinUpdateRequestCodec.java:173)
at 
org.apache.solr.handler.loader.JavabinLoader.parseAndLoadDocs(JavabinLoader.java:106)
at org.apache.solr.handler.loader.JavabinLoader.load(JavabinLoader.java:58)
at 
org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:92)
at 
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1859)
at 
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:721)
...
Caused by: java.net.SocketException: Broken pipe
at java.net.SocketOutputStream.socketWrite0(Native Method)
at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:109)
at java.net.SocketOutputStream.write(SocketOutputStream.java:153)
at 
org.apache.coyote.http11.InternalOutputBuffer.realWriteBytes(InternalOutputBuffer.java:215)
at org.apache.tomcat.util.buf.ByteChunk.flushBuffer(ByteChunk.java:480)
at org.apache.tomcat.util.buf.ByteChunk.append(ByteChunk.java:366)
at 
org.apache.coyote.http11.InternalOutputBuffer$OutputStreamOutputBuffer.doWrite(InternalOutputBuffer.java:240)
at 
org.apache.coyote.http11.filters.ChunkedOutputFilter.doWrite(ChunkedOutputFilter.java:119)
at 
org.apache.coyote.http11.AbstractOutputBuffer.doWrite(AbstractOutputBuffer.java:192)
at org.apache.coyote.Response.doWrite(Response.java:520)
at 
org.apache.catalina.connector.OutputBuffer.realWriteBytes(OutputBuffer.java:408)
... 37 more

This is with Solr 4.6.1

Re: solr cloud distributed optimize() becomes serialized

2014-03-21 Thread Mark Miller
Recently fixed in Lucene - should be able to find the issue if you dig a little.
-- 
Mark Miller
about.me/markrmiller

On March 21, 2014 at 10:25:56 AM, Greg Walters (greg.walt...@answers.com) wrote:

I've seen this on 4.6.  

Thanks,  
Greg  

On Mar 20, 2014, at 11:58 PM, Shalin Shekhar Mangar  
wrote:  

> That's not right. Which Solr versions are you on (question for both  
> William and Chris)?  
>  
> On Fri, Mar 21, 2014 at 8:07 AM, William Bell  wrote:  
>> Yeah. optimize() also used to come back immediately if the index was
>> already optimized. It just reopened the index.
>>
>> We used to use that for cleaning up the old directories quickly. But now it
>> does another optimize() even though the index is already optimized.
>>  
>> Very strange.  
>>  
>>  
>> On Tue, Mar 18, 2014 at 11:30 AM, Chris Lu  wrote:  
>>  
>>> I wonder whether this is a known bug. In previous SOLR cloud versions, 4.4  
>>> or maybe 4.5, an explicit optimize(), without any parameters, it usually  
>>> took 2 minutes for a 32 core cluster.  
>>>  
>>> However, in 4.6.1, the same call took about 1 hour. Checking the index  
>>> modification time for each core shows 2 minutes gap if sorted.  
>>>  
>>> We are using a solrj client connecting to zookeeper. I found it is talking  
>>> to a specific solr server A, and that server A is distributing the calls to 
>>>  
>>> all other solr servers. Here is the thread dump for this server A:  
>>>  
>>> at  
>>>  
>>> org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:395)
>>>   
>>> at  
>>>  
>>> org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:199)
>>>   
>>> at  
>>>  
>>> org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer.request(ConcurrentUpdateSolrServer.java:293)
>>>   
>>> at  
>>>  
>>> org.apache.solr.update.SolrCmdDistributor.submit(SolrCmdDistributor.java:226)
>>>   
>>> at  
>>>  
>>> org.apache.solr.update.SolrCmdDistributor.distribCommit(SolrCmdDistributor.java:195)
>>>   
>>> at  
>>>  
>>> org.apache.solr.update.processor.DistributedUpdateProcessor.processCommit(DistributedUpdateProcessor.java:1250)
>>>   
>>> at  
>>>  
>>> org.apache.solr.handler.RequestHandlerUtils.handleCommit(RequestHandlerUtils.java:69)
>>>   
>>>  
>>  
>>  
>>  
>> --  
>> Bill Bell  
>> billnb...@gmail.com  
>> cell 720-256-8076  
>  
>  
>  
> --  
> Regards,  
> Shalin Shekhar Mangar.  



How to stop backup once initiated

2014-03-21 Thread search engn dev
My index size is 20 GB and I have issued the Solr backup command. The backup
is now in progress and taking too much time, so how can I stop the backup command?





Re: solr cloud distributed optimize() becomes serialized

2014-03-21 Thread Greg Walters
I've seen this on 4.6.

Thanks,
Greg

On Mar 20, 2014, at 11:58 PM, Shalin Shekhar Mangar  
wrote:

> That's not right. Which Solr versions are you on (question for both
> William and Chris)?
> 
> On Fri, Mar 21, 2014 at 8:07 AM, William Bell  wrote:
>> Yeah. optimize() also used to come back immediately if the index was
>> already optimized. It just reopened the index.
>> 
>> We used to use that for cleaning up the old directories quickly. But now it
>> does another optimize() even though the index is already optimized.
>> 
>> Very strange.
>> 
>> 
>> On Tue, Mar 18, 2014 at 11:30 AM, Chris Lu  wrote:
>> 
>>> I wonder whether this is a known bug. In previous SOLR cloud versions, 4.4
>>> or maybe 4.5, an explicit optimize(), without any parameters, it usually
>>> took 2 minutes for a 32 core cluster.
>>> 
>>> However, in 4.6.1, the same call took about 1 hour. Checking the index
>>> modification time for each core shows 2 minutes gap if sorted.
>>> 
>>> We are using a solrj client connecting to zookeeper. I found it is talking
>>> to a specific solr server A, and that server A is distributing the calls to
>>> all other solr servers. Here is the thread dump for this server A:
>>> 
>>> at
>>> 
>>> org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:395)
>>> at
>>> 
>>> org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:199)
>>> at
>>> 
>>> org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer.request(ConcurrentUpdateSolrServer.java:293)
>>> at
>>> 
>>> org.apache.solr.update.SolrCmdDistributor.submit(SolrCmdDistributor.java:226)
>>> at
>>> 
>>> org.apache.solr.update.SolrCmdDistributor.distribCommit(SolrCmdDistributor.java:195)
>>> at
>>> 
>>> org.apache.solr.update.processor.DistributedUpdateProcessor.processCommit(DistributedUpdateProcessor.java:1250)
>>> at
>>> 
>>> org.apache.solr.handler.RequestHandlerUtils.handleCommit(RequestHandlerUtils.java:69)
>>> 
>> 
>> 
>> 
>> --
>> Bill Bell
>> billnb...@gmail.com
>> cell 720-256-8076
> 
> 
> 
> -- 
> Regards,
> Shalin Shekhar Mangar.



Re: Solr4.7 No live SolrServers available to handle this request

2014-03-21 Thread Greg Walters
Broken pipe errors are generally caused by unexpected disconnections and are 
sometimes hard to track down. Given the stack traces you've provided it's hard 
to point to any one thing and I suspect the relevant information was snipped 
out in the "long dump of document fields". You might grab the entire error from 
the client you're uploading documents with, the server you're connected to and 
any other nodes that have an error at the same time and put it on pastebin or 
the like.

Thanks,
Greg

On Mar 20, 2014, at 3:36 PM, Michael Sokolov  
wrote:

> I'm getting a similar exception when writing documents (on the client side).  
> I can write one document fine, but the second (which is being routed to a 
> different shard) generates the error.  It happens every time - definitely not 
> a resource issue or timing problem since this database is completely empty -- 
> I'm just getting started and running some tests, so there must be some kind 
> of setup problem.  But it's difficult to diagnose (for me, anyway)!  I'd 
> appreciate any insight, hints, guesses, etc. since I'm stuck. Thanks!
> 
> One node (the leader?) is reporting "Internal Server Error" in its log, and 
> another node (presumably the shard where the document is being directed) 
> bombs out like this:
> 
> ERROR - 2014-03-20 15:56:53.022; org.apache.solr.common.SolrException; 
> null:org.apache.solr.common.SolrException: ERROR adding document 
> SolrInputDocument(
> 
> ... long dump of document fields
> 
> )
>at 
> org.apache.solr.handler.loader.JavabinLoader$1.update(JavabinLoader.java:99)
>at 
> org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readOuterMostDocIterator(JavaBinUpdateRequestCodec.java:166)
>at 
> org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readIterator(JavaBinUpdateRequestCodec.java:136)
>at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:225)
>at 
> org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readNamedList(JavaBinUpdateRequestCodec.java:121)
>at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:190)
>at 
> org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:116)
>at 
> org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec.unmarshal(JavaBinUpdateRequestCodec.java:173)
>at 
> org.apache.solr.handler.loader.JavabinLoader.parseAndLoadDocs(JavabinLoader.java:106)
>at org.apache.solr.handler.loader.JavabinLoader.load(JavabinLoader.java:58)
>at 
> org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:92)
>at 
> org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
>at 
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
>at org.apache.solr.core.SolrCore.execute(SolrCore.java:1859)
>at 
> org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:721)
> ...
> Caused by: java.net.SocketException: Broken pipe
>at java.net.SocketOutputStream.socketWrite0(Native Method)
>at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:109)
>at java.net.SocketOutputStream.write(SocketOutputStream.java:153)
>at 
> org.apache.coyote.http11.InternalOutputBuffer.realWriteBytes(InternalOutputBuffer.java:215)
>at org.apache.tomcat.util.buf.ByteChunk.flushBuffer(ByteChunk.java:480)
>at org.apache.tomcat.util.buf.ByteChunk.append(ByteChunk.java:366)
>at 
> org.apache.coyote.http11.InternalOutputBuffer$OutputStreamOutputBuffer.doWrite(InternalOutputBuffer.java:240)
>at 
> org.apache.coyote.http11.filters.ChunkedOutputFilter.doWrite(ChunkedOutputFilter.java:119)
>at 
> org.apache.coyote.http11.AbstractOutputBuffer.doWrite(AbstractOutputBuffer.java:192)
>at org.apache.coyote.Response.doWrite(Response.java:520)
>at 
> org.apache.catalina.connector.OutputBuffer.realWriteBytes(OutputBuffer.java:408)
>... 37 more
> 
> This is with Solr 4.6.1, Tomcat 7.  Here's my clusterstate.json. Updates are 
> being sent to the test1x3 collection
> 
> 
> {
>  "test3x1":{
>"shards":{
>  "shard1":{
>"range":"8000-d554",
>"state":"active",
>"replicas":{"core_node1":{
>"state":"active",
>"base_url":"http://10.4.24.37:8080/solr";,
>"core":"test3x1_shard1_replica1",
>"node_name":"10.4.24.37:8080_solr",
>"leader":"true"}}},
>  "shard2":{
>"range":"d555-2aa9",
>"state":"active",
>"replicas":{"core_node3":{
>"state":"active",
>"base_url":"http://10.4.24.39:8080/solr";,
>"core":"test3x1_shard2_replica1",
>"node_name":"10.4.24.39:8080_solr",
>"leader":"true"}}},
>  "shard3":{
>"range":"2aaa-7fff",
>"state":"active",
>"replicas":{"core_node2":{
>"

Is there a Field/FieldType property to enable storing of term vector payloads?

2014-03-21 Thread Daniel Jamrog
I see properties to enable term vectors, positions, and offsets, but I didn't
find one for payloads. Did I just miss it? If not, is this something
that may be added in the future?

Thanks


Re: SOLR synonyms - Explicit mappings

2014-03-21 Thread Nicole Lacoste
That looks right.  Have you mistakenly added the synonym filter on the
indexing side as well?  You can use the solr admin analysis page (maybe at
http://localhost:8983/solr/#/collection1/analysis)  to debug.
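
For reference, a query-time-only synonym setup looks like the following
sketch (the fieldType name and the surrounding analyzer chain here are
illustrative, not taken from the thread):

<fieldType name="text_syn" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- an explicit mapping such as "watch => smartwatch" replaces the
         left-hand side with the right-hand side at query time -->
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

If the same SynonymFilterFactory is also present in the index-time analyzer,
documents containing "watch" get indexed as "smartwatch" as well, which would
produce exactly the "matches both keywords" behaviour described below.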

Niki



On 21 March 2014 00:03, bbi123  wrote:

> I need some clarification of how to define explicit mappings in
> synonyms.txt
> file.
>
> I have been using equivalent synonyms for a while and it works as expected.
>
> I am confused with explicit mapping.
>
> I have the below synonyms added to query analyzer.
>
> I want the search on keyword 'watch' to actually do a search on
> 'smartwatch'
> but the below query mapping seems to bring the documents that contain both
> keywords 'watch' and 'smartwatch'.. Am I doing anything wrong?
>
> watch => smartwatch
>
> Thanks for your help!!!
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/SOLR-synonyms-Explicit-mappings-tp4125858.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>





Re: Best SSD block size for large SOLR indexes

2014-03-21 Thread Salman Akram
For now I am going with 64KB and the results seem good. Thanks for the useful
feedback.


On Wed, Mar 19, 2014 at 9:30 PM, Shawn Heisey  wrote:

> On 3/19/2014 12:09 AM, Salman Akram wrote:
>
>> Thanks for the info. The articles were really useful but still seems I
>> have
>> to do my own testing to find the right page size? I thought for large
>> indexes there would already be some tests done in SOLR community.
>>
>> Side note: We are heavily using Microsoft technology (.NET etc) for
>> development so by looking at all the pros/cons decided to stick with
>> Windows. Wasn't rude ;)
>>
>
> Assuming you are only going to be putting Solr data on it, or anything
> else you put on it will also consist of large files, I would probably go
> with a cluster size at least 64KB for an NTFS volume, and I might consider
> 128KB or 256KB.  There *ARE* a few small files in a Solr index, but not
> enough of them for the wasted space to become a problem.
>
> The easiest way to configure Solr to use a different location than the
> program directory is to change the solr home.
>
> Thanks,
> Shawn
>
>


-- 
Regards,

Salman Akram


Re: Rounding errors with SOLR score

2014-03-21 Thread Raymond Wiker
Are you sure that SOLR is rounding incorrectly, and not simply differently
from what you expect? I was surprised myself at some of the rounding
behaviour I saw with SOLR, but according to
http://en.wikipedia.org/wiki/Rounding , the results were valid (just not
the round-up-from-half that I naively expected).
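
As a worked example of that behaviour (assuming Solr's rint() delegates to
Java's Math.rint, which rounds halfway cases to the nearest even value):

  rint(2.5)  = 2.0
  rint(3.5)  = 4.0
  rint(-2.5) = -2.0

i.e. halfway values go to the even neighbour rather than the round-half-up
most people expect.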


On Fri, Mar 21, 2014 at 3:27 AM, William Bell  wrote:

> When doing complex boosting/bq we are getting rounding errors on the score.
>
> To get the score to be consistent I needed to use rint on sort:
>
> sort=rint(product(sum($p_score,$s_score,$q_score),100)) desc,s_query asc
>
> recip(priority,1,.5,.01)
> product(recip(synonym_rank,1,1,.01),17)
> 
> query({!dismax qf="user_query_edge^1 user_query^0.5 user_query_fuzzy"
> v=$q1})
> 
>
> The issue is in the qf area.
>
> {"s_query": "Ear Irrigation","score": 10.331313},{"s_query": "Ear
> Piercing",
> "score": 10.331314},{"s_query": "Ear Pinning","score": 10.331313},
>
> --
> Bill Bell
> billnb...@gmail.com
> cell 720-256-8076
>


Re: understand debuginfo from query

2014-03-21 Thread aowen
I found a good page that explains the debug output, but it is still unclear to 
me why the field plain_text is not worth anything: the query term was found 3 
times.

You can see it here: http://explain.solr.pl/explains/a90aze3o



 ao...@hispeed.ch wrote:
> I want the info simplified so that the user can see why a doc was found.
> 
> Below is the output for a doc:
> 
> 0.085597195 = (MATCH) sum of:
>   0.083729245 = (MATCH) max of:
> 0.0019158133 = (MATCH) weight(plain_text:test^10.0 in 601) 
> [DefaultSimilarity], result of:
>   0.0019158133 = score(doc=601,freq=9.0 = termFreq=9.0
> ), product of:
> 0.022560213 = queryWeight, product of:
>   10.0 = boost
>   3.6232536 = idf(docFreq=81, maxDocs=1130)
>   6.2265067E-4 = queryNorm
> 0.084920004 = fieldWeight in 601, product of:
>   3.0 = tf(freq=9.0), with freq of:
> 9.0 = termFreq=9.0
>   3.6232536 = idf(docFreq=81, maxDocs=1130)
>   0.0078125 = fieldNorm(doc=601)
> 0.083729245 = (MATCH) weight(inhaltstyp:test^6.0 in 601) 
> [DefaultSimilarity], result of:
>   0.083729245 = score(doc=601,freq=1.0 = termFreq=1.0
> ), product of:
> 0.017686278 = queryWeight, product of:
>   6.0 = boost
>   4.734136 = idf(docFreq=26, maxDocs=1130)
>   6.2265067E-4 = queryNorm
> 4.734136 = fieldWeight in 601, product of:
>   1.0 = tf(freq=1.0), with freq of:
> 1.0 = termFreq=1.0
>   4.734136 = idf(docFreq=26, maxDocs=1130)
>   1.0 = fieldNorm(doc=601)
> 0.013458222 = (MATCH) weight(title:test^20.0 in 601) [DefaultSimilarity], 
> result of:
>   0.013458222 = score(doc=601,freq=1.0 = termFreq=1.0
> ), product of:
> 0.042281017 = queryWeight, product of:
>   20.0 = boost
>   3.395244 = idf(docFreq=102, maxDocs=1130)
>   6.2265067E-4 = queryNorm
> 0.31830412 = fieldWeight in 601, product of:
>   1.0 = tf(freq=1.0), with freq of:
> 1.0 = termFreq=1.0
>   3.395244 = idf(docFreq=102, maxDocs=1130)
>   0.09375 = fieldNorm(doc=601)
>   0.001867952 = (MATCH) product of:
> 0.003735904 = (MATCH) sum of:
>   0.003735904 = (MATCH) ConstantScore(expiration:[1395328539325 TO *]), 
> product of:
> 1.0 = boost
> 0.003735904 = queryNorm
> 0.5 = coord(1/2)
>   0.0 = (MATCH) FunctionQuery(div(int(clicks),max(int(displays),const(1, 
> product of:
> 0.0 = div(int(clicks)=0,max(int(displays)=432,const(1)))
> 8.0 = boost
> 6.2265067E-4 = queryNorm 
> 
> 
> why is the sum 0.085597195? this would mean 0.083729245 + 0.001867952 and 
> these are not included in the sum: 0.0019158133 + 0.013458222  + 0.003735904 
> 
> am i looking at the wrong total?
> aren't these 2 cases the ones i have to sum up "x = (MATCH) sum of" or x = 
> score(" ?
> 
> i'm trying to extract the fields that where used for weighing the doc.
> 



Re: join and filter query with AND

2014-03-21 Thread Marcin Rzewucki
Hi,

Erick, I do not get your point. What kind of servlet container settings do
you mean, and why do you think they might be related? I'm using Jetty and have
never set any limit on packet size. My query fails only when the value contains
double quotes and a space between words. Why? It works in the other cases
described in my first mail.

Cheers.
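
One thing that might be worth ruling out (an assumption on my part, not
something confirmed in this thread): whether the space inside the quoted
phrase survives URL encoding on the way to Solr. A raw space ends the HTTP
request line, so the parser would receive exactly city:"Stara and fail with
the Lexical error / <EOF> shown below. Properly form-encoded, the phrase part
of the fq would arrive as

  city%3A%22Stara+Zagora%22

Comparing that against the decoded fq in the Solr request log (as Erick
suggests below) would show whether the phrase reaches the parser intact.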



On 20 March 2014 15:23, Erick Erickson  wrote:

> Well, the error message really looks like your input is
> getting chopped off.
>
> It's vaguely possible that you have some super-low limit
> in your servlet container configuration that is only letting very
> small packets through.
>
> What I'd do is look in the Solr log file to see exactly what
> is coming through. Because regardless of what you _think_
> you're sending, it _really_ looks like Solr is getting the fq
> clause with something that breaks it up. So I'd like to
> absolutely nail that as being wrong before speculating.
>
> Because I can cut/paste your fq clause just fine. Of course
> it fails because I don't have the other core defined, but that
> means the query has made it through query parsing while
> yours hasn't in your setup.
>
> Best,
> Erick
>
> On Thu, Mar 20, 2014 at 2:19 AM, Marcin Rzewucki 
> wrote:
> > Nope. There is no line break in the string and it is not feed from file.
> > What else could be the reason ?
> >
> >
> >
> > On 19 March 2014 17:57, Erick Erickson  wrote:
> >
> >> It looks to me like you're feeding this from some
> >> kind of text file and you really _do_ have a
> >> line break after "Stara
> >>
> >> Or have a line break in the string you paste into the URL
> >> or something similar.
> >>
> >> Kind of shooting in the dark though.
> >>
> >> Erick
> >>
> >> On Wed, Mar 19, 2014 at 8:48 AM, Marcin Rzewucki 
> >> wrote:
> >> > Hi,
> >> >
> >> > I have the following issue with join query parser and filter query.
> For
> >> > such query:
> >> >
> >> > *:*
> >> > 
> >> > (({!join from=inner_id to=outer_id fromIndex=othercore}city:"Stara
> >> > Zagora")) AND (prod:214)
> >> > 
> >> >
> >> > I got error:
> >> > 
> >> > 
> >> > org.apache.solr.search.SyntaxError: Cannot parse 'city:"Stara':
> Lexical
> >> > error at line 1, column 12. Encountered:  after : "\"Stara"
> >> > 
> >> > 400
> >> > 
> >> >
> >> > Stack:
> >> > DEBUG - 2014-03-19 13:35:20.825;
> >> org.eclipse.jetty.servlet.ServletHandler;
> >> > chain=SolrRequestFilter->default
> >> > DEBUG - 2014-03-19 13:35:20.826;
> >> > org.eclipse.jetty.servlet.ServletHandler$CachedChain; call filter
> >> > SolrRequestFilter
> >> > ERROR - 2014-03-19 13:35:20.828; org.apache.solr.common.SolrException;
> >> > org.apache.solr.common.SolrException:
> org.apache.solr.search.SyntaxError:
> >> > Cannot parse 'city:"Stara': Lexical error at line 1, column 12.  E
> >> > ncountered:  after : "\"Stara"
> >> > at
> >> >
> >>
> org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:179)
> >> > at
> >> >
> >>
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:193)
> >> > at
> >> >
> >>
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
> >> > at org.apache.solr.core.SolrCore.execute(SolrCore.java:1916)
> >> > at
> >> >
> >>
> org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:780)
> >> > at
> >> >
> >>
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:427)
> >> > at
> >> >
> >>
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:217)
> >> > at
> >> >
> >>
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
> >> > at
> >> >
> >>
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
> >> > at
> >> >
> >>
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
> >> > at
> >> >
> >>
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
> >> > at
> >> >
> >>
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
> >> > at
> >> >
> >>
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
> >> > at
> >> >
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
> >> > at
> >> >
> >>
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
> >> > at
> >> >
> >>
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
> >> > at
> >> >
> >>
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
> >> > at
> >> >
> >>
> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
> >> > at
> >> >
> >>
> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
> >> > at
> >> >
> >>
> org.eclipse.jetty.server.handle

Re: SOLR & Typo3

2014-03-21 Thread Gora Mohanty
On 21 March 2014 13:54, Bernhard Prange  wrote:
> Hey Group,
> I am trying to use SOLR with TYPO3.
>
> It works so far. But I get an "?sword_list[]=endometrial&no_cache=1" on the
> end of each link, causing the linking not to work. How do I remove that? Do
> I have to configure this within RealUrl?

You should ask on a TYPO3 list: This seems to have nothing to do with Solr.

Regards,
Gora


SOLR & Typo3

2014-03-21 Thread Bernhard Prange

Hey Group,
I am trying to use SOLR with TYPO3.

It works so far, but I get an "?sword_list[]=endometrial&no_cache=1" appended 
to the end of each link, causing the links not to work. How do I remove that? 
Do I have to configure this within RealUrl?


Thanks for your help.


Re: Memory + WeakIdentityMap

2014-03-21 Thread jim ferenczi
Hi,
If you are not on Windows, you can try to disable the tracking of clones in
the MMapDirectory by setting unmap to false in your solrconfig.xml:

<directoryFactory name="DirectoryFactory" class="solr.MMapDirectoryFactory">
  <bool name="unmap">false</bool>
</directoryFactory>

The MMapDirectory keeps track of all clones in a weak map and forces the
unmapping of the buffers on close. This was added because on Windows
mmapped files cannot be modified or deleted. If unmap is false, the weak map
is not created and the weak references you see in your heap should disappear
as well.
You can find more information here:
https://issues.apache.org/jira/browse/LUCENE-4740

Thanks,
Jim


2014-03-21 6:56 GMT+01:00 Shawn Heisey :

> On 3/20/2014 6:54 PM, Harish Agarwal wrote:
> > I'm transitioning my index from a 3.x version to >4.6.  I'm running a
> large
> > heap (20G), primarily to accomodate a large facet cache (~5G), but have
> > been able to run it on 3.x stably.
> >
> > On 4.6.0 after stress testing I'm finding that all of my shards are
> > spending all of their time in GC.  After taking a heap dump and
> analyzing,
> > it appears that org.apache.lucene.util.WeakIdentityMap is using many Gs
> of
> > memory.  Does anyone have any insight into which Solr component(s) use
> this
> > and whether this kind of memory consumption is to be expected?
>
> I can't really say what WeakIdentityMap is doing.  I can trace the only
> usage in Lucene to MMapDirectory, but it doesn't make a lot of sense for
> this to use a lot of memory, unless this is the source of the memory
> misreporting that Java 7 seems to do with MMap.  See this message in a
> recent thread on this mailing list:
>
>
> http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201403.mbox/%3c53285ca1.9000...@elyograg.org%3E
>
> If you have a lot of facets, one approach for performance is to use
> facet.method=enum so that your Java heap does not need to be super large.
>
> This does not actually reduce the overall system memory requirements.
> It just shifts the responsibility for caching to the operating system
> instead of Solr, and requires that you have enough memory to put a
> majority of the index into the OS disk cache.  Ideally, there would be
> enough RAM for the entire index to fit.
>
> http://wiki.apache.org/solr/SolrPerformanceProblems
>
> Another option for facet memory optimization is docValues.  One caveat:
> It is my understanding that the docValues content is the same as a
> stored field.  Depending on your schema definition, this may be
> different than the indexed values that facets normally use.  The
> docValues feature also helps with sorting.
>
> Thanks,
> Shawn
>
>