Re: Issue Using Solr 5.3 Authentication and Authorization Plugins

2015-09-02 Thread Kevin Lee
I’ve found that completely exiting Chrome or Firefox and opening it back up re-prompts for credentials when they are required. It was re-prompting with the /browse path where authentication was working each time I completely exited and started the browser again, however it won’t re-prompt

Re: 'missing content stream' issuing expungeDeletes=true

2015-09-02 Thread Derek Poh
There are around 6+ million documents in the collection. Each document (or product record) is unique in the collection. When we found out the document has a docfreq of 2, we did a query on the document's product id and indeed 2 documents were returned. We suspect 1 of them is deleted but not

Using bq param for negative boost

2015-09-02 Thread Kevin Lee
Hi, I’m trying to boost all results using the bq param with edismax where termA and termB do not appear in the field, but if phraseC appears it doesn’t matter if termA and termB appear. The following works and boosts everything that doesn’t have termA and termB in myField so the effect is

Re: Issue Using Solr 5.3 Authentication and Authorization Plugins

2015-09-02 Thread Noble Paul
"However, after uploading the new security.json and restarting the web browser," The browser remembers your login, so it is unlikely to prompt for the credentials again. Why don't you try the RELOAD operation from the command line (curl)? On Tue, Sep 1, 2015 at 10:31 PM, Kevin Lee

Highlighting snippets truncated when matching large number of indexed documents

2015-09-02 Thread hsharma mailinglists
Hi there, I'm observing that the snippets being returned in the highlighting section of the response are getting truncated. However, this behavior is being seen only when the query matches a large number of documents and the results requested are near the end of the Solr-returned overall results

Re: Solr cloud hangs, log4j contention issue observed

2015-09-02 Thread Arnon Yogev
Thank you Shawn, We are indeed using Tomcat, maxThreads was set to 2000 (Normally seen <600 active threads under load). I attached the complete stack trace of http-bio-8443-exec-37460 below. The thread is marked as "Waiting on Condition", and does not mention any lock it's waiting for.

Please add me to SolrWiki contributors

2015-09-02 Thread Gaurav Kumar
Hi, I am working on an open source tool for the Solr Camel component; it would be great if you could add me to the list of contributors. Also I realized that you guys have upgraded the wiki to Solr 5.3, but we are using Solr 4, and suddenly now there is no information available for the older

Re: Re: Re: Re: concept and choice: custom sharding or auto sharding?

2015-09-02 Thread scott chu
Hello solr-user, Sorry, wrong again. Auto sharding is not the implicit router. - Original Message - From: scott chu To: solr-user Date: 2015-09-02, 23:50:20 Subject: Re: Re: Re: concept and choice: custom sharding or auto sharding? Hello solr-user, Thanks! I'll go back to check my old

Re: String bytes can be at most 32766 characters in length?

2015-09-02 Thread Erick Erickson
Yes, that is an intentional limit for the size of a single token, which strings are. Why not use deduplication? See: https://cwiki.apache.org/confluence/display/solr/De-Duplication You don't have to replace the existing documents, and Solr will compute a hash that can be used to identify
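
A minimal sketch of the idea in the reply above, assuming the goal is only to tell identical content apart: the 32766 limit applies to the encoded bytes of a single token, so a short fixed-length hash of the content can stand in for the full string. The function names are hypothetical, and MD5 is used purely for illustration; this is not Solr's own signature implementation.

```python
import hashlib

# Limit quoted in the thread subject; it applies to the encoded bytes of one token.
MAX_TERM_BYTES = 32766


def exceeds_term_limit(text):
    """True if the UTF-8 encoding of `text` is too large for a single token."""
    return len(text.encode("utf-8")) > MAX_TERM_BYTES


def content_signature(text):
    """Fixed-length hex digest that is identical for identical content."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()
```

Two documents with exactly the same content get the same signature, so the signature field, rather than the full content, can be used to find duplicates. Note that the limit is counted in bytes, so text with multi-byte characters reaches it with fewer characters.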

Re: Re: concept and choice: custom sharding or auto sharding?

2015-09-02 Thread scott chu
Hello solr-user, Do you mean I only have to put 10M documents in one index and copy it to many slaves in a classic Solr master-slave architecture to provide a querying service on the internet, and it won't noticeably degrade query performance? But I did add 1M documents into one index on

Re: String bytes can be at most 32766 characters in length?

2015-09-02 Thread Zheng Lin Edwin Yeo
Hi Erick, Yes, I'm trying out De-Duplication too. But I'm facing a problem with it: indexing stops working once I put the following De-Duplication code in solrconfig.xml. The problem seems to be with this dedupe line. dedupe true signature false

Re: Re: concept and choice: custom sharding or auto sharding?

2015-09-02 Thread Erick Erickson
bq: Why do you say: "at 10M documents there's rarely a need to shard at all?" Because I routinely see 50M docs on a single node and I've seen over 300M docs on a single node with sub-second responses. So if you're saying that you see poor performance at 1M docs then I suspect there's something

Solr Join support in Multiple Shard

2015-09-02 Thread Maulin Rathod
As per this link (http://wiki.apache.org/solr/Join), Solr Join is supported only for cores in a single shard. Is there any plan to support joins across cores in multiple shards?

Re: 'missing content stream' issuing expungeDeletes=true

2015-09-02 Thread Erick Erickson
bq: When we found out the document has a docfreq of 2, we did a query on the document's product id and indeed 2 documents were returned. We suspect 1 of them is deleted but not removed from the index. This is totally inconsistent with how Solr works _if_ these documents had the same value for

Re: Re: Re: concept and choice: custom sharding or auto sharding?

2015-09-02 Thread scott chu
Hello solr-user, Thanks! I'll go back to check my old environment, and that article is really helpful. BTW, I think I was wrong about compositeID. The reference guide says compositeID needs numShards. That means what I described in question 5 seems wrong, because I intend to plan one shard

Re: String bytes can be at most 32766 characters in length?

2015-09-02 Thread Erick Erickson
_How_ does it fail? You must be seeing something in the logs. On Wed, Sep 2, 2015 at 8:29 AM, Zheng Lin Edwin Yeo wrote: > Hi Erick, > > Yes, i'm trying out the De-Duplication too. But I'm facing a problem with > that, which is the indexing stops working once I put in

Re: Solr Join support in Multiple Shard

2015-09-02 Thread Erick Erickson
It's been discussed, but I'd guess it's likely to have performance problems. That said, there's some very interesting stuff being done with Streaming Aggregation and, built on top of that, Parallel SQL. But how that applies to your use-case I don't know. Best, Erick On Wed, Sep 2, 2015 at 9:05

Re: concept and choice: custom sharding or auto sharding?

2015-09-02 Thread Shawn Heisey
On 9/2/2015 9:19 AM, scott chu wrote: > Do you mean I only have to put 10M documents in one index and copy > it to many slaves in a classic Solr master-slave architecture to > provide querying service on internet, and it won't have obvious > downgrade of query performance? But I did have

Rules for pre-processing queries

2015-09-02 Thread Siamak Rowshan
Hi all, I need to refine my search results by adding parameters to search query parameters. For example, if user enters "ipad", I want to add a filter query such as ("category=tablets") to refine the search results. I thought a more general solution would be to define rules, that examine the

Cannot search on special characters such as $ or

2015-09-02 Thread Steven White
Hi Everyone, I have the following in my schema: In the text file "wdfftypes.txt", I have this: & => DIGIT $ => DIGIT I also tried: & => ALPHA $ => ALPHA I then index data that contains the string: "~ ! @

Re: String bytes can be at most 32766 characters in length?

2015-09-02 Thread Alexandre Rafalovitch
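And that's because you have an incomplete chain. If you look at the full example in solrconfig.xml, it shows:

  <updateRequestProcessorChain name="dedupe">
    <processor class="solr.processor.SignatureUpdateProcessorFactory">
      <bool name="enabled">true</bool>
      <str name="signatureField">id</str>
      <bool name="overwriteDupes">false</bool>
      <str name="fields">name,features,cat</str>
      <str name="signatureClass">solr.processor.Lookup3Signature</str>
    </processor>
    <processor class="solr.LogUpdateProcessorFactory" />
    <processor class="solr.RunUpdateProcessorFactory" />
  </updateRequestProcessorChain>

Notice the last two processors.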

Re: which solrconfig.xml

2015-09-02 Thread Chris Hostetter
: various $HOME/solr-5.3.0 subdirectories. The documents/tutorials say to edit : the solrconfig.xml file for various configuration details, but they never say : which one of these dozen to edit. Moreover, I cannot determine which version Can you please give us some specific examples (i.e., URLs,

Re: String bytes can be at most 32766 characters in length?

2015-09-02 Thread Zheng Lin Edwin Yeo
Hi Erick, I couldn't really find anything special in the logs. The indexing process just went on normally, but after that when I check the index, there is nothing indexed. This is what I see from the logs. Looks the same as when the indexing works fine. INFO - 2015-09-03 01:24:35.316;

Re: Cannot search on special characters such as $ or

2015-09-02 Thread Erick Erickson
The Admin/Analysis page is your friend. On a quick test, $ and & never make it past StandardTokenizerFactory. Best, Erick On Wed, Sep 2, 2015 at 5:17 PM, Steven White wrote: > Hi Everyone, > > I have the following in my schema: > >positionIncrementGap="100"

Re: String bytes can be at most 32766 characters in length?

2015-09-02 Thread Zheng Lin Edwin Yeo
Hi Alexandre, Thanks for pointing out the error. I'm able to get the documents indexed after adding the two processors. However, I'm still seeing all the similar documents returned when searching the content, without being de-duplicated. My content is currently indexed as fieldType=text_general.

Re: Position of Document in Listing (Search Result)

2015-09-02 Thread Erick Erickson
Well, you have access to the start parameter; isn't it just start+(ordinal position in the page)? Best, Erick On Wed, Sep 2, 2015 at 7:01 PM, Shayan Haque wrote: > Hi, > > I need to get a document position within a search result for a specific > member, to show them where
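
The arithmetic suggested above can be sketched as follows. This is only an illustration: the function names and the `is_match` callback are hypothetical, and it assumes `start` is the zero-based offset sent with the request and `docs` is the list of documents returned for that page.

```python
def absolute_position(start, index_in_page):
    """1-based rank in the full result list.

    start: zero-based `start` offset used for the request.
    index_in_page: zero-based index of the document within the returned page.
    """
    return start + index_in_page + 1


def find_position(docs, start, is_match):
    """Rank of the first document on this page for which is_match(doc) is true."""
    for index, doc in enumerate(docs):
        if is_match(doc):
            return absolute_position(start, index)
    return None  # not on this page; the caller would request the next page
```

For example, with start=20, the third document on the page has rank 23. If the member's document is not on the page, the caller has to keep paging until it is found.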

Re: Difference between Legacy Facets and JSON Facets

2015-09-02 Thread Zheng Lin Edwin Yeo
> As far as I can see, JSON Facets does not have this delayed mapping mechanism: Every increment requires a call to the segment->global-ordinal map. With a large field this map cannot be in the fast caches. Combine this with a gazillion references and it makes sense that JSON Facets is slower in

Re: Position of Document in Listing (Search Result)

2015-09-02 Thread Erick Erickson
It's entirely unclear what you mean by "position". bq: where for "make and model" his first result comes Comes in what? The search result list? Some a-priori ordering of all the cars that has nothing to do with this search? The results list of everyone's cars that have the same make and model?

Position of Document in Listing (Search Result)

2015-09-02 Thread Shayan Haque
Hi, I need to get a document's position within a search result for a specific member, to show them where their results lie for a particular set of filters... I tried using a Solr-Ranking plugin but it's outdated, version 3.5 compatible. Is there some other way? Ordinal ranking or any other thing..

Re: Position of Document in Listing (Search Result)

2015-09-02 Thread Shayan Haque
Thanks for the reply, Erick. How do I get the position? I am searching on e.g. car model and make, and I want to show at which position the member's first car falls for that specific car model and make. So I tell Solr, get the listing for the cars with the model and make. I want from that result, if

Re: Difference between Legacy Facets and JSON Facets

2015-09-02 Thread Toke Eskildsen
Yonik Seeley wrote: > Hmmm, well something is really wrong for this orders of magnitude > difference. I've never seen anything like that and we should > definitely try to get to the bottom of it. This might be a wild goose chase, but... Zheng states it is a text field with

Re: is there any way to tell delete by query actually deleted anything?

2015-09-02 Thread Renee Sun
Shawn, thanks for the reply. I have a sharded index. When I re-index a document (vs new index, which is different process), I need to delete the old one first to avoid dup. We all know that if there is only one core, the newly added document will replace the old one, but with multiple core

which solrconfig.xml

2015-09-02 Thread Mark Fenbers
Hi, I've been fiddling with Solr for two whole days since downloading/unzipping it. I've learned a lot by reading 4 documents and the web site. However, there are a dozen or so instances of solrconfig.xml in various $HOME/solr-5.3.0 subdirectories. The documents/tutorials say to edit the

Re: is there any way to tell delete by query actually deleted anything?

2015-09-02 Thread Shawn Heisey
On 9/2/2015 1:30 PM, Renee Sun wrote: > Is there an easy way for me to get the actually deleted document number? I > mean if the query did not hit any documents, I want to know that nothing got > deleted. But if it did hit documents, i would like to know how many were > delete... I do this by

is there any way to tell delete by query actually deleted anything?

2015-09-02 Thread Renee Sun
I run this curl command to delete some messages: curl 'http://localhost:8080/solr/mycore/update?commit=true=abacd' | xmllint --format - or curl 'http://localhost:8080/solr/mycore/update?commit=true=myfield:mycriteria' | xmllint --format - The result I get looks like: % Total% Received %

Re: is there any way to tell delete by query actually deleted anything?

2015-09-02 Thread Mark Ehle
Do a search with the same criteria before and after? On Wed, Sep 2, 2015 at 3:30 PM, Renee Sun wrote: > I run this curl trying to delete some messages : > > curl > 'http://localhost:8080/solr/mycore/update?commit=true= > abacd' > | xmllint --format - > > or > > curl >
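
A rough sketch of the before/after suggestion above, assuming the core answers `/select` with the usual JSON response containing `response.numFound`. The helper names are hypothetical, the delete itself is left to the caller (`run_delete`), and the count is only meaningful if nothing else updates matching documents in between.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def count_url(core_url, query):
    """Build a select URL that returns only the hit count (rows=0)."""
    return core_url + "/select?" + urlencode({"q": query, "rows": 0, "wt": "json"})


def num_found(core_url, query, fetch=None):
    """Return numFound for `query`; `fetch` can be replaced for testing."""
    if fetch is None:
        fetch = lambda url: urlopen(url).read()
    return json.loads(fetch(count_url(core_url, query)))["response"]["numFound"]


def deleted_by(core_url, query, run_delete, fetch=None):
    """Count matches, let the caller delete and commit, then count again."""
    before = num_found(core_url, query, fetch)
    run_delete()  # caller issues the delete-by-query and the commit
    after = num_found(core_url, query, fetch)
    return before - after
```

A result of 0 means the query matched nothing before the delete, which is the case the original poster wanted to detect.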

Re: Rules for pre-processing queries

2015-09-02 Thread Arcadius Ahouansou
Hello Siamak. You may also want to have a look at 3 related articles, the 3rd part being: http://lucidworks.com/blog/query-autofiltering-extended-language-logic-search/ I would start from the 1st part. Hope this helps a bit. Arcadius. On 2 September 2015 at 21:09, Upayavira

Re: Rules for pre-processing queries

2015-09-02 Thread Upayavira
Do you have a predefined list of such filters? You can do fun things with synonyms: define an ipad->tablet synonym, and use it at query time. Filter out all non-synonym terms in your query time analysis chain, and then use that field as a filter. Upayavira On Wed, Sep 2, 2015, at 09:07 PM,
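
A client-side sketch of roughly the same effect, for readers who want to see the rule idea in code. It is not the query-time synonym analysis chain described above: it is a plain lookup table applied before the request is sent, and the rule table, function name, and `category:tablets` filter syntax are assumptions for illustration.

```python
from urllib.parse import urlencode

# Hypothetical rule table: query term -> filter query to add.
FILTER_RULES = {"ipad": "category:tablets"}


def build_params(user_query, rules=FILTER_RULES):
    """Return request parameters with an fq added for each term that has a rule."""
    params = [("q", user_query)]
    seen = set()
    for term in user_query.lower().split():
        fq = rules.get(term)
        if fq and fq not in seen:  # add each filter only once
            seen.add(fq)
            params.append(("fq", fq))
    return params


# Encode for use in a request URL.
query_string = urlencode(build_params("ipad"))
```

Terms without a rule pass through unchanged, so queries outside the predefined list behave as before.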

Re: is there any way to tell delete by query actually deleted anything?

2015-09-02 Thread Erick Erickson
bq: I have a sharded index. When I re-index a document (vs new index, which is different process), I need to delete the old one first to avoid dup No, you do not need to issue the delete in a sharded collection _assuming_ that the doc has the same <uniqueKey>. Why do you think you do? If it's in some doc

Re: is there any way to tell delete by query actually deleted anything?

2015-09-02 Thread Renee Sun
Thanks Shawn... on the other side, I have just created a thin-layer webapp that I deploy with Solr/Tomcat. This webapp provides a RESTful API that allows all kinds of clients in our system to call and request a commit on a certain core on that Solr server. I put it in with the idea to have a centre/final

Merging documents from a distributed search

2015-09-02 Thread tedsolr
I've read from http://heliosearch.org/solrs-mergestrategy/ that the AnalyticsQuery component only works for a single instance of Solr. I'm planning to "migrate" to the SolrCloud soon and I have a custom AnalyticsQuery module that collapses what I

Re: is there any way to tell delete by query actually deleted anything?

2015-09-02 Thread Renee Sun
Hi Erick... as Shawn pointed out... I am not using solrcloud, I am using a more complicated sharding scheme, home grown... thanks for your response :-) Renee

Re: is there any way to tell delete by query actually deleted anything?

2015-09-02 Thread Renee Sun
Hi Shawn, I think we have a similar structure where we use frontier/back instead of hot/cold :-) so yes, we will probably have to do the same. Since we have large customers and some of them may have terabytes of data and end up with hundreds of cold cores, the blind delete broadcasting to all of

Re: is there any way to tell delete by query actually deleted anything?

2015-09-02 Thread Shawn Heisey
On 9/2/2015 3:32 PM, Renee Sun wrote: > I think we have similar structure where we use frontier/back instead of > hot/cold :-) > > so yes we will probably have to do the same. > > since we have large customers and some of them may have tera bytes data and > end up with hundreds of cold cores

Re: which solrconfig.xml

2015-09-02 Thread Alexandre Rafalovitch
Have you looked at the Admin Web UI in detail yet? When you look at the "Overview" page, on the right hand side, it lists a bunch of directories. You want the one that says "Instance". Then, your solrconfig.xml is in the "conf" directory under that. Regards, Alex. P.s. Welcome!

RE: Rules for pre-processing queries

2015-09-02 Thread Siamak Rowshan
Upayavira, wow! Didn’t think it'd work that well, and would be so easy to do! I do have a predefined list, so synonyms work great! Thanks! Siamak Rowshan

Re: is there any way to tell delete by query actually deleted anything?

2015-09-02 Thread Shawn Heisey
On 9/2/2015 2:24 PM, Renee Sun wrote: > I have a sharded index. When I re-index a document (vs new index, which is > different process), I need to delete the old one first to avoid dup. We all > know that if there is only one core, the newly added document will replace > the old one, but with

Re: Merging documents from a distributed search

2015-09-02 Thread Joel Bernstein
The merge strategy probably won't work for the type of distributed collapse you're describing. You may want to begin exploring the Streaming API which supports real-time map/reduce operations, http://joelsolr.blogspot.com/2015/03/parallel-computing-with-solrcloud.html Joel Bernstein

Re: Issue Using Solr 5.3 Authentication and Authorization Plugins

2015-09-02 Thread Noble Paul
I opened a ticket for this: https://issues.apache.org/jira/browse/SOLR-8004 On Wed, Sep 2, 2015 at 1:36 PM, Kevin Lee wrote: > I’ve found that completely exiting Chrome or Firefox and opening it back up > re-prompts for credentials when they are required. It was

Re: Difference between Legacy Facets and JSON Facets

2015-09-02 Thread Yonik Seeley
On Wed, Sep 2, 2015 at 1:19 AM, Zheng Lin Edwin Yeo wrote: > The type of field is text_general. What are some typical values for this "content" field (i.e. how many different words does the content field contain for each document)? -Yonik > I found that the problem mainly

Re: Strange behavior of solr

2015-09-02 Thread Zheng Lin Edwin Yeo
Is there any error message in the log when Solr stops indexing the file at line 2046? Regards, Edwin On 2 September 2015 at 17:17, Long Yan wrote: > Hey, > I have created a core with > bin\solr create -c mycore > > I want to index the csv sample files from solr-5.2.1 > >

Re: Please add me to SolrWiki contributors

2015-09-02 Thread Shawn Heisey
On 9/1/2015 11:28 PM, Gaurav Kumar wrote: > I am working on writing some open source tool for Solr Camel component, it > would be great if you can add me to list of contributors. > Also I realized that you guys have upgraded the wiki to Solr 5.3, but we are > using Solr 4, and suddenly now there

Re: Strange behavior of solr

2015-09-02 Thread Erik Hatcher
See example/films/README.txt. The “name” field is guessed incorrectly (because the first film has name=“.45”), so indexing errors out once it hits a name value that is not numeric. The README provides a command to define the name field *before* indexing. If you’ve indexed and had the name

Question about a strange behavior

2015-09-02 Thread Long Yan
Good day, I created a core with the following command: bin\solr create -c mycore When I index the file film.csv under solr-5.2.1\example\films\, Solr can only index up to the line "2046,Wong Kar-wai,Romance Film|Fantasy|Science Fiction|Drama,,/en/2046_2004,2004-05-20".

Strange behavior of solr

2015-09-02 Thread Long Yan
Hey, I have created a core with bin\solr create -c mycore I want to index the csv sample files from solr-5.2.1 If I index film.csv under solr-5.2.1\example\films\, solr can only index this file until the line "2046,Wong Kar-wai,Romance Film|Fantasy|Science

String bytes can be at most 32766 characters in length?

2015-09-02 Thread Zheng Lin Edwin Yeo
Hi, I would like to check: must string bytes be at most 32766 characters in length? I'm trying to do a copyField of my rich-text documents' content to a field with fieldType=string to try getting distinct results for content, as there are several documents with the exact same

Re: Question about a strange behavior

2015-09-02 Thread Hasan Diwan
You might get a better response in English... Vielleicht haben Sie eine bessere Antwort bekommen in... (from Google Translate, as my own German is non-existent) -- H 2015-09-02 2:05 GMT-07:00 Long Yan : > Guten Tag, > ich habe einen Core mit dem folgendem Befehl erstellt >

Re: Difference between Legacy Facets and JSON Facets

2015-09-02 Thread Zheng Lin Edwin Yeo
Q) What are some typical values for this "content" field (i.e. how many different words does the content field contain for each document)? A) They are indexed from word and pdf documents, the highest is 278 pages long (about 372000 bytes when indexed into Solr). There's thousands of different

concept and choice: custom sharding or auto sharding?

2015-09-02 Thread scott chu
I posted a question on Stack Overflow: http://stackoverflow.com/questions/32343813/custom-sharding-or-auto-sharding-on-solrcloud However, since this is a mailing list, I am reposting the question below to ask for suggestions and a finer understanding of SolrCloud's document-routing behavior. I want to

Re: concept and choice: custom sharding or auto sharding?

2015-09-02 Thread Erick Erickson
Frankly, at 10M documents there's rarely a need to shard at all. Why do you think you need to? This seems like adding complexity for no good reason. Sharding should only really be used when you have too many documents to fit on a single shard as it adds some overhead, restricts some possibilities