Re: /export handler to stream data using CloudSolrStream: JSONParse Exception

2016-10-20 Thread Joel Bernstein
I suspect this is a bug with improperly escaped JSON. SOLR-7441 resolved this issue and was released in Solr 6.0. There have been a large number of improvements, bug fixes, new features and much better error handling in Solr 6 Streaming Expressions.
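For readers unfamiliar with this class of failure, here is a minimal sketch in Python (not Solr code, and not the actual bug from the ticket). It assumes a made-up field value containing double quotes and shows that JSON assembled by plain string concatenation is rejected by a strict parser, while the same value passed through a serializer parses cleanly.

```python
import json

# Hypothetical field value containing double quotes (illustrative only).
raw_value = 'say "hi"'

# JSON assembled by hand, copying the value in without escaping it.
broken = '{"space": "' + raw_value + '"}'

# The same document produced by a serializer, which escapes the quotes.
escaped = json.dumps({"space": raw_value})


def parses(text):
    """Return True if text is valid JSON, False if the parser rejects it."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


broken_ok = parses(broken)
escaped_ok = parses(escaped)
```

A client reading a stream of such documents would fail on the first unescaped value, which may be consistent with a parse exception appearing only for some result sets.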

Re: For TTL, does expirationFieldName need to be indexed?

2016-10-20 Thread Chetas Joshi
You just need to have indexed=true. It will use the inverted index to delete the expired documents. You don't need stored=true as all the info required by the DocExpirationUpdateProcessorFactory to delete a document is there in the inverted index. On Thu, Oct 20, 2016 at 4:26 PM, Brent
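The reasoning above can be sketched in Python. This is a simplified illustration, not Solr's implementation: the field name `expire_at` and the sample documents are assumptions, and the exact query Solr issues is not shown in this thread. The idea is that a periodic delete only needs a range match on the expiration value, which is why a searchable field is enough and the stored value is never read back.

```python
from datetime import datetime, timedelta, timezone


def expiration_delete_query(field_name, now):
    """Build a range query matching documents whose expiration is at or before now."""
    stamp = now.strftime("%Y-%m-%dT%H:%M:%SZ")
    return f"{field_name}:[* TO {stamp}]"


def expired_ids(docs, field_name, now):
    """Mimic what the range query selects, using only the expiration value."""
    return [d["id"] for d in docs if d[field_name] <= now]


# Hypothetical data: one document already expired, one still live.
now = datetime(2016, 10, 20, 16, 30, tzinfo=timezone.utc)
docs = [
    {"id": "a", "expire_at": now - timedelta(hours=1)},
    {"id": "b", "expire_at": now + timedelta(hours=1)},
]

query = expiration_delete_query("expire_at", now)
to_delete = expired_ids(docs, "expire_at", now)
```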

Re: For TTL, does expirationFieldName need to be indexed?

2016-10-20 Thread Brent
Thanks for the reply. Follow up: Do I need to have the field stored? While I don't need to ever look at the field's original contents, I'm guessing that the DocExpirationUpdateProcessorFactory does, so that would mean I need to have stored=true as well, correct?

RE: Load balancing with solr cloud

2016-10-20 Thread Garth Grimm
No matter where you send the update to initially, it will get sent to the leader of the shard first. The leader parses it to ensure it can be indexed, then sends it to all the replicas in parallel. The replicas will do their parsing and report back that they have persisted

Re: Load balancing with solr cloud

2016-10-20 Thread Sadheera Vithanage
Thank you very much John and Garth, I've tested it out and it works fine, I can send the updates to any of the solr nodes. If I am not using a zookeeper aware client and if I always direct all my queries (read queries) to the leader of the solr instances, does it automatically load balance

RE: Load balancing with solr cloud

2016-10-20 Thread Garth Grimm
Actually, zookeeper really won't participate in the update process at all. If you're using a "zookeeper aware" client like SolrJ, the SolrJ library will read the cloud configuration from zookeeper, but will send all the updates to the leader of the shard that the document is meant to go to. If
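A rough Python sketch of the client-side routing described above, under stated assumptions: the two-shard cluster state, the host names, and the use of CRC32 as the hash are all hypothetical stand-ins (Solr uses a different hash function, and SolrJ's real logic is more involved). It shows only the shape of the idea: the client reads the cluster layout once, then picks the leader of the shard whose hash range covers the document id, so ZooKeeper is not on the update path.

```python
import zlib

# Hypothetical cluster state, of the kind a ZooKeeper-aware client reads and caches.
CLUSTER_STATE = {
    "shard1": {"range": (0x00000000, 0x7FFFFFFF), "leader": "http://solr1:8983/solr"},
    "shard2": {"range": (0x80000000, 0xFFFFFFFF), "leader": "http://solr2:8983/solr"},
}


def shard_for(doc_id):
    """Pick the shard whose hash range contains the hash of the document id."""
    # Stand-in hash for illustration; this is not the function Solr uses.
    h = zlib.crc32(doc_id.encode("utf-8"))
    for name, info in CLUSTER_STATE.items():
        low, high = info["range"]
        if low <= h <= high:
            return name
    raise ValueError("no shard covers hash %d" % h)


def leader_url_for(doc_id):
    """Return the leader URL the update for this document would be sent to."""
    return CLUSTER_STATE[shard_for(doc_id)]["leader"]


target = leader_url_for("doc-1")
```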

Re: Load balancing with solr cloud

2016-10-20 Thread John Bickerstaff
Others on the list are more expert, but I think your #1 Zookeeper will not get hammered. As I understand it, Solr itself (the leader) will handle farming out the work to the other two Solr nodes. The amount of traffic on the Zookeeper instances should be minimal. Now - could your SolrCloud of 3

Load balancing with solr cloud

2016-10-20 Thread Sadheera Vithanage
Hi again Experts, I have a question related to load balancing in solr cloud. If we have 3 zookeeper nodes and 3 solr instances (1 leader, 2 secondary replicas and 1 shard), when the traffic comes in the primary zookeeper server will be hammered, correct? I understand (or is it wrong) that

/export handler to stream data using CloudSolrStream: JSONParse Exception

2016-10-20 Thread Chetas Joshi
Hello, I am using /export handler to stream data using CloudSolrStream. I am using fl=uuid,space,timestamp where uuid and space are Strings and timestamp is long. My query (q=...) is not on these fields. While reading the results from the Solr cloud, I get the following errors

Re: indexing - offline

2016-10-20 Thread Rallavagu
Thanks Evan for quick response. On 10/20/16 10:19 AM, Tom Evans wrote: On Thu, Oct 20, 2016 at 5:38 PM, Rallavagu wrote: Solr 5.4.1 cloud with embedded jetty Looking for some ideas around offline indexing where an independent node will be indexed offline (not in the

Re: (solrcloud) Importing documents into "implicit" router

2016-10-20 Thread John Bickerstaff
This may help? https://cwiki.apache.org/confluence/display/solr/Shards+and+Indexing+Data+in+SolrCloud On Thu, Oct 20, 2016 at 12:09 PM, Customer wrote: > Hey, > > I hope you all are doing well.. > > I got a router with "router.name=implicit" with couple of shards

Re: (solrcloud) Importing documents into "implicit" router

2016-10-20 Thread John Bickerstaff
more specifically, this bit from that page seems like it might be of interest: If you created the collection and defined the "implicit" router at the time of creation, you can additionally define a router.field parameter to use a field from each document to identify a shard where the document
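As a hedged illustration of that suggestion, the Python sketch below builds a Collections API CREATE request that uses the "implicit" router with a router.field, then tags rows with a target shard. The host, the collection name `testcoll`, the field name `shard_s`, and the alternating split are assumptions made for the example; only `router.name`, `shards`, and `router.field` are meant as the parameters under discussion.

```python
from urllib.parse import urlencode, parse_qs, urlsplit


def create_collection_url(base_url, name, shards, route_field):
    """Build a CREATE request for a collection that uses the implicit router."""
    params = {
        "action": "CREATE",
        "name": name,
        "router.name": "implicit",
        "shards": ",".join(shards),
        "router.field": route_field,
    }
    return base_url + "/admin/collections?" + urlencode(params)


def assign_shards(rows, shards, route_field):
    """Tag each row with a target shard, alternating so the rows split evenly."""
    return [
        dict(row, **{route_field: shards[i % len(shards)]})
        for i, row in enumerate(rows)
    ]


shards = ["shardA", "shardB"]
url = create_collection_url("http://localhost:8983/solr", "testcoll", shards, "shard_s")
query_params = parse_qs(urlsplit(url).query)

# Hypothetical rows standing in for the MySQL table mentioned in the question.
rows = [{"id": "1"}, {"id": "2"}, {"id": "3"}, {"id": "4"}]
docs = assign_shards(rows, shards, "shard_s")
```

With the route field set on each document before indexing, half of the sample rows carry `shardA` and half carry `shardB`.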

(solrcloud) Importing documents into "implicit" router

2016-10-20 Thread Customer
Hey, I hope you all are doing well. I have a collection with "router.name=implicit" and a couple of shards (let's call them shardA and shardB), and a MySQL table ready to import for testing purposes. So, for example, I want to load half of the data into shardA and the rest into shardB. Question

Re: Result Grouping vs. Collapsing Query Parser -- Can one be deprecated?

2016-10-20 Thread Jeff Wartes
I’ll also mention the choice to improve processing speed by allocating more memory, which increases the importance of GC tuning. This bit me when I tried using it on a larger index. https://issues.apache.org/jira/browse/SOLR-9125 I don’t know if the result grouping feature shares the same

Re: indexing - offline

2016-10-20 Thread Tom Evans
On Thu, Oct 20, 2016 at 5:38 PM, Rallavagu wrote: > Solr 5.4.1 cloud with embedded jetty > > Looking for some ideas around offline indexing where an independent node > will be indexed offline (not in the cloud) and added to the cloud to become > leader so other cloud nodes

Soft commit from curl

2016-10-20 Thread Michal Danilák
Does the following command issue soft commit or hard commit? curl http://localhost:8984/solr/update?softCommit=true -H "Content-Type: text/xml" --data-binary '' How to find out which commit was triggered? Can I get it somewhere in logs? Thanks.

indexing - offline

2016-10-20 Thread Rallavagu
Solr 5.4.1 cloud with embedded jetty Looking for some ideas around offline indexing where an independent node will be indexed offline (not in the cloud) and added to the cloud to become leader so other cloud nodes will get replicated. Wonder if this is possible without interrupting the live

Re: Memory Issue with SnapPuller

2016-10-20 Thread Jihwan Kim
This is also the screenshot of jvisualvm. This exception occurred at 2:55PM and 3:40PM and OOME occurs at 3:41PM. SnapPuller - java.lang.InterruptedException at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:404) at java.util.concurrent.FutureTask.get(FutureTask.java:191) at

Re: Memory Issue with SnapPuller

2016-10-20 Thread Jihwan Kim
Good points. I am able to reproduce this with a periodic snap puller and only one HTTP request. When I load Solr on Tomcat, the initial memory usage was between 600M and 800M. The first time, I used 1.5G and then increased the heap to 3.5G. (When I said 'triple', I meant comparing to the initial

Re: registered User

2016-10-20 Thread Erick Erickson
Done, thanks! On Thu, Oct 20, 2016 at 4:56 AM, kult.n...@googlemail.com wrote: > Hi, > > please add my User "NilsFaupel" of the Solr Wiki to the ContributorsGroup. > > Regards > > Nils

Re: Memory Issue with SnapPuller

2016-10-20 Thread Erick Erickson
You say you tripled the memory. Up to what? Tripling from 500M to 1.5G isn't likely enough, tripling from 6G to 18G is something else again. You can take a look through any of the memory profilers and try to catch the objects (and where they're being allocated). The second is to look at the

Re: Memory Issue with SnapPuller

2016-10-20 Thread Jihwan Kim
Thank you Shawn. I understand the two options. After my own testing with a smaller heap, I more than tripled my heap size, but the OOME happens again with my test cases under the controlled thread process. The increased heap size just delayed the OOME. Can you provide feedback on my second

Re: Memory Issue with SnapPuller

2016-10-20 Thread Shawn Heisey
On 10/20/2016 8:44 AM, Jihwan Kim wrote: > We are using Solr 4.10.4 and experiencing out of memory exception. It > seems the problem is caused by the following code & scenario. When you get an OutOfMemoryError exception that tells you there's not enough heap space, the place where the exception

Re: Memory Issue with SnapPuller

2016-10-20 Thread Jihwan Kim
A little more about "At certain timing, this method also throw " SnapPuller - java.lang.InterruptedException at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:404) at java.util.concurrent.FutureTask.get(FutureTask.java:191) at

Re: Memory Issue with SnapPuller

2016-10-20 Thread Jihwan Kim
Sorry, wrong button was clicked. A little more about "At certain timing, this method also throw " SnapPuller - java.lang.InterruptedException at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:404) at java.util.concurrent.FutureTask.get(FutureTask.java:191) at

Memory Issue with SnapPuller

2016-10-20 Thread Jihwan Kim
Hi, We are using Solr 4.10.4 and experiencing an out of memory exception. It seems the problem is caused by the following code & scenario. This is the last part of a fetchLastIndex method in SnapPuller.java // we must reload the core after we open the IW back up if (reloadCore) {

Re: Facet behavior

2016-10-20 Thread Yonik Seeley
On Thu, Oct 20, 2016 at 8:45 AM, Bastien Latard | MDPI AG wrote: > Hi Yonik, > > Thanks for your answer! > I'm not quite sure I understood everything...please, see my comments below. > > >> On Wed, Oct 19, 2016 at 6:23 AM, Bastien Latard | MDPI AG >>

group.facet fails when facet on double field

2016-10-20 Thread karel braeckman
Hi, We are trying to upgrade from Solr 4.8 to Solr 6.2. This query: ?q=*%3A*=0=2=json=true=true=mediaObjectId=true=rating=true is returning the following error: null:org.apache.solr.common.SolrException: Exception during facet.field: rating at

Re: Facet behavior

2016-10-20 Thread Bastien Latard | MDPI AG
Hi Yonik, Thanks for your answer! I'm not quite sure I understood everything...please, see my comments below. On Wed, Oct 19, 2016 at 6:23 AM, Bastien Latard | MDPI AG wrote: I just had a question about facets. *==> Is the facet run on all documents (to pre-process/cache

Filter result of faceting query

2016-10-20 Thread Davide Isoardi
Hi all, I need to filter the faceting results of a query by range. E.g.: q=*%3A*=id=json=0=true=client=10=false=true I get this result, but I would like only the entries in a specific range (from 5 to 9). In this case I would want only "RoundTeam",703461 and "Hootsuite",575569 returned, {

registered User

2016-10-20 Thread kult.n...@googlemail.com
Hi, please add my User "NilsFaupel" of the Solr Wiki to the ContributorsGroup. Regards Nils