Solr grouping performance

2013-08-05 Thread Alok Bhandari
Hello, I need some functionality for which I found that grouping is the most suitable feature. I want to know about the performance issues associated with it. In some posts I found that performance is a bottleneck, but I want to know: if I have 3 million records with 0.5 million distinct values

Re: Solr 4.3 log4j

2013-08-05 Thread Prasi S
It didn't work with either option. On Mon, Aug 5, 2013 at 12:19 PM, Shawn Heisey wrote: > On 8/5/2013 12:19 AM, Prasi S wrote: > > I'm using Solr 4.3 to set up SolrCloud. I have placed all jar files in a > > folder zoo-lib. I have also placed the jar files from > /solr/example/lib/ext > > to zoo-l

Re: Solr 4.3 log4j

2013-08-05 Thread Prasi S
I can see there is a change in the logging for Solr from 4.3 onwards, and in the steps for setting it up correctly in Tomcat. But this causes a problem while loading configurations to ZooKeeper. Am I missing anything? On Mon, Aug 5, 2013 at 12:51 PM, Prasi S wrote: > It didn't work with either option.

Re: Solr grouping performance

2013-08-05 Thread Paul Masurel
Collapsing is not that slow actually. With a high number of groups, you may just have to leave group.ngroups set to false. If you need to get the overall number of groups, you may have to patch Lucene. https://issues.apache.org/jira/browse/LUCENE-3972?page=com.atlassian.jira.plugin.system.issuetab
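
A minimal sketch, in Python, of the kind of request described above. It only builds the URL and sends nothing. The host, the core name `collection1` and the field name `category` are placeholders that do not come from this thread; `group`, `group.field` and `group.ngroups` are the standard Solr result-grouping parameters.

```python
from urllib.parse import urlencode


def grouping_url(base_url, field, ngroups=False, rows=10):
    """Build a Solr select URL that groups results on `field`.

    group.ngroups stays false by default, since counting all groups is
    the part reported as expensive when there are many distinct values.
    """
    params = {
        "q": "*:*",
        "wt": "json",
        "rows": rows,
        "group": "true",
        "group.field": field,
        "group.ngroups": "true" if ngroups else "false",
    }
    return base_url + "/select?" + urlencode(params)


# Placeholder core and field names, for illustration only.
url = grouping_url("http://localhost:8983/solr/collection1", "category")
print(url)
```

Whether the group count is needed at all is worth deciding first; if it is only used for display, leaving it off avoids the cost entirely.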

Transform data at index time: country -> continent

2013-08-05 Thread Christian Köhler - ZFMK
Hi, I am indexing data from a mysql data source. Each record contains the field "country". I am looking for a suitable way to create a field "continent" at indexing time. A list with the information country -> continent is given. Writing a script and calling it as a transformer in the sql query

Transform data at index time: country -> continent

2013-08-05 Thread Christian Köhler - ZFMK
Hi, I am indexing data from a mysql data source. Each record contains the field "country". I am looking for a suitable way to create a field "continent" at indexing time. A list with the information country -> continent is given. With my limited knowledge of solr writing a script and calling it

solr - using fq parameter does not retrieve an answer

2013-08-05 Thread Mysurf Mail
When I query using http://localhost:8983/solr/vault/select?q=*:* I get results including the following ... ... 7 ... Now I try to get only that row so I add to my query fq=VersionNumber:7 http://localhost:8983/solr/vault/select?q=*:*&fq=VersionNumber:7 And I get nothing. Any idea?

"optimize" index : impact on performance [Republished]

2013-08-05 Thread Anca Kopetz
Hi, [I am sending again my message to the mailing list, as well as Shawn's reply. Thanks Shawn for your explanations] We are trying to improve the performance of our Solr Search application in terms of QPS (queries per second). We tuned SOLR settings (e.g. mergeFactor=3), launched several ben

Top Ten Terms

2013-08-05 Thread Furkan KAMACI
Hi; When I click Schema Browser on the Admin page and load term info for a field, I get the top ten terms. When I click the question mark next to it, it redirects me to the Solr query page. The query on that page is: http://localhost:8983/solr/#/collection1/query?q=content:[* TO *] What is the relation between that que

Re: Transform data at index time: country -> continent

2013-08-05 Thread Christian Köhler - ZFMK
Hi, please excuse the multiple emails to the list. There is a mailserver issue - our admin has fixed it (he said ...). @ list-admin: you may delete my previous duplicate mails (9:59, 10:01 and 10:34) from the list. Sorry for the noise! Chris -- Christian Köhler Tel.: 0228 9122-433 Zoologisches

Re: Transform data at index time: country -> continent

2013-08-05 Thread Raymond Wiker
Don't know about "best practice", but to me, the obvious solution would be to have a database table holding the relationships between countries and continents, and using a join to get the continent. On Mon, Aug 5, 2013 at 9:59 AM, Christian Köhler - ZFMK wrote: > Hi, > > I am indexing data from

Re: Transform data at index time: country -> continent

2013-08-05 Thread Christian Köhler - ZFMK
Hi, to have a database table holding the relationships between countries and continents, and using a join to get the continent. I forgot to mention: I only have reading access to the database. Regards Chris -- Christian Köhler Tel.: 0228 9122-433 Zoologisches Forschungsmuseum Alexander Koen

SOLR FieldCopyProcessorFactory

2013-08-05 Thread Luís Portela Afonso
Hi, Is there something like a FieldCopyProcessorFactory? I know there is a CloneFieldProcessor, but I'm interested in doing an append. Is that possible? Many Thanks

Re: Invalid UTF-8 character 0xfffe during shard update

2013-08-05 Thread Federico Chiacchiaretta
Hi, I reproduced the bug on solr 4.4.0. The bug is specific to SolrCloud, so the bug occurs only when data has to be forwarded to another node (say I start dataimport on node1 and it forwards data to node2). Here is the log I found on target node: ERROR - 2013-08-05 11:57:48.739; org.apache.solr.c

Re: Top Ten Terms

2013-08-05 Thread Stefan Matheis
Perhaps the Question-Mark is not the best Icon for what it does .. suggestions always welcome :) But indeed it "only" leads you to the query-interface, pre-defining "content:[* TO *]" (or whatever the name of the selected field is) as a default-query. It's not especially related to the terms ..

Re: Top Ten Terms

2013-08-05 Thread Furkan KAMACI
I can open an issue at Jira and replace it with magnifier :) 2013/8/5 Stefan Matheis > Perhaps the Question-Mark is not the best Icon for what it does .. > suggestions always welcome :) But indeed it "only" lead you to the > query-interface, pre-defining "content:[* TO *]" (or whatever the nam

Re: Measuring SOLR performance

2013-08-05 Thread Dmitry Kan
Hi Roman, No problem. Still trying to launch the thing.. The query with the added -t parameter generated an error: 1. python solrjmeter.py -a -x ./jmx/SolrQueryTest.jmx -q ./queries/demo/demo.queries -s localhost -p 8983 -a --durationInSecs 60 -R test -t /solr/statements [passed relative path

SolrCloud RemoteSolrException: We are not the leader

2013-08-05 Thread Alexey Kozhemiakin
Dear All, We are facing strange issue with SolrCloud (4.4 with Embedded Zookeeper). Cluster consists of 2 shards and 4 nodes. 4th node cannot be added to cluster and stays in "recovering" state with following error in logs. Picture from admin cloud interface http://imageshack.us/photo/my-images

Re: Percolate feature?

2013-08-05 Thread Charlie Hull
On 03/08/2013 00:50, Mark wrote: We have a set number of known terms we want to match against. In Index: "term one" "term two" "term three" I know how to match all terms of a user query against the index but we would like to know how/if we can match a user's query against all the terms in the

SolrCloud requires fixed ip address?

2013-08-05 Thread Alexey Kozhemiakin
Dear All, Our SolrCloud cluster (4 nodes, 4 shards, embedded ZooKeeper) failed to start after the VMs were restarted after the weekend. We shut down the 4 VMs in our private cloud for the weekend and started Solr in the same order as they were initialized: first the ZooKeeper-hosting node and then the 3 other nodes. Unfor

Re: Querying a specific core in solr cloud

2013-08-05 Thread Erick Erickson
bq: This should be a bug right? I actually don't think so, although one can argue it either way. The point of SolrCloud is to try to work as robustly as possible. &distrib=false is intended to be a debugging tool rather than something that's part of "normal" operation IMO. If you really want to u

Re: Encountered invalid class name

2013-08-05 Thread Erick Erickson
You don't have the relevant Hadoop jars in your classpath. At least one of these classes is in hadoop-hdfs-2.0.5-alpha.jar, which, when I did just the normal jetty start of the 4.4 distro, was in: ./solr-webapp/webapp/WEB-INF/lib/hadoop-hdfs-2.0.5-alpha.jar So I'd guess you didn't get them into th

Collection - loadOnStartup

2013-08-05 Thread Srivatsan
Hi, I am using solr -4.3.0 for my search application. I will create collection via CollectionAPI. I tried to pass "loadOnStartup" value also. But that approach didnt work. My question is, How to set loadOnStartup to false for cores getting created by CollectionsAPI? I am creating cores by the

RE: Document Similarity Algorithm at Solr/Lucene

2013-08-05 Thread Alexey Kozhemiakin
We considered the MLT component to implement a sort of "near exact duplicate detection" - which is probably very similar to your task. http://wiki.apache.org/solr/MoreLikeThis You may think of MoreLikeThis as a two phase process (transform a document to query and run it): 1a) it tokeniz

Re: Collection - loadOnStartup

2013-08-05 Thread Erick Erickson
You haven't given us much information to go on. _how_ does it fail? What do the logs show? Any error returned? What is the response from the server? Is zookeeper showing any problems? Best Erick On Mon, Aug 5, 2013 at 7:01 AM, Srivatsan wrote: > Hi, > > I am using solr -4.3.0 for my search appl

Re: additional requests sent to solr

2013-08-05 Thread Erick Erickson
Why do you care? Is this causing you trouble? In general distributed search requires two round trips to the "other" shards. The first query gets the top N, those are returned to the originator (just a list of IDs and sort criteria, often score). The originator then assembles the final top N, but th

Re: Solr round ratings to nearest integer value

2013-08-05 Thread Erick Erickson
If you don't want to change the SQL query, then you probably need to write a custom update component, which is not hard to do. Otherwise, the suggestions already offered are viable. Best Erick On Mon, Aug 5, 2013 at 1:21 AM, Thyagaraj wrote: > Hello Erick, > > Is it possible to without changi

Field append

2013-08-05 Thread Luís Portela Afonso
Hi there, Is that possible to append two fields on solr? i would like to append to filters with a custom delimiter. Is that possible? I saw something like a CloneFieldUpdateProcessor, but when i try to use, solr says that cannot find the class. I saw that in the follow site: https://issues.apac

Re: Performance question on Spatial Search

2013-08-05 Thread Steven Bower
So after re-feeding our data with a new boolean field that is true when data exists and false when it doesn't our search times have gone from avg of about 20s to around 150ms... pretty amazing change in perf... It seems like https://issues.apache.org/jira/browse/SOLR-5093 might alleviate many peopl

Re: Collection - loadOnStartup

2013-08-05 Thread Srivatsan
No errors in zookeeper and solr. I m using CloudSolrServer for creating collections as said above.I just want to set loadOnStartup to false for cores in solr.xml. I dont want all cores to loadonstartup. Hence when creating collection, i m trying to set this parameter to false. But still i m gettin

Re: Field append

2013-08-05 Thread Jack Krupansky
Jiras cover work in progress, so sometimes the intermediate comments don’t reflect the final API. See: http://lucene.apache.org/solr/4_4_0/solr-core/org/apache/solr/update/processor/CloneFieldUpdateProcessorFactory.html -- Jack Krupansky From: Luís Portela Afonso Sent: Monday, August 05, 2013

Re: Encountered invalid class name

2013-08-05 Thread Artem Karpenko
Hi, I am able to reproduce the problem. Simple JBoss install, putting solr.war into deployments/ directory or deploying via web-interface - results are the same, warning at startup. Checked WEB-INF/lib folder - relevant JARs are there. Weird, since basically the whole Solr code is in those li

Re: Query to return different types of documents based on a field

2013-08-05 Thread André Maldonado
Hi Jack, thanks for your reply and sorry for my bad English. Yes, I have other relevancy rules. All of them are calculated and stored in a single field, so we just order based on that. I think I understood what you said, but how will I paginate? I'll have to get all the documents for this query f

Re: Query to return different types of documents based on a field

2013-08-05 Thread Jack Krupansky
Unless you disclose your specific relevancy requirements, there's not much we can do to help you. So, based on what you have disclosed, the answer is that you need to "over query", request say twice as many documents as you need and then filter on the application side. And then you may need
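
A rough Python sketch of the "over query, then filter on the application side" idea above. The `fetch` callable stands in for whatever client call returns documents from Solr, and `fake_fetch`, the `type` field and the factor of 2 are all illustrative assumptions rather than anything specified in this thread.

```python
def overquery_page(fetch, keep, page_size, factor=2):
    """Request factor * page_size documents, keep the ones that pass
    `keep`, and return at most page_size of them."""
    candidates = fetch(0, page_size * factor)
    kept = [doc for doc in candidates if keep(doc)]
    return kept[:page_size]


def fake_fetch(start, rows):
    """Hypothetical stand-in for a Solr request: every third document is
    of a type that the application wants to drop."""
    docs = [{"id": i, "type": "ad" if i % 3 == 0 else "listing"}
            for i in range(100)]
    return docs[start:start + rows]


page = overquery_page(fake_fetch, lambda d: d["type"] == "listing", page_size=5)
print([d["id"] for d in page])
```

Note that this only covers the first page. Later pages need the application to remember how far into the raw result list the previous page reached, which is the pagination difficulty raised earlier in the thread.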

Re: Transform data at index time: country -> continent

2013-08-05 Thread Shawn Heisey
On 8/5/2013 3:02 AM, Christian Köhler - ZFMK wrote: >> to have a database table holding the relationships between countries and >> continents, and using a join to get the continent. > > I forgot to mention: I only have reading access to the database. Somebody's got to write something. If you don

Re: Collection - loadOnStartup

2013-08-05 Thread didier deshommes
For Solr 4.3.0, I don't think you can pass loadOnStartup to the Collections API, although the Cores API accepts it. That's been my experience anyway. On Mon, Aug 5, 2013 at 6:27 AM, Srivatsan wrote: > No errors in zookeeper and solr. I m using CloudSolrServer for creating > collections as said

Re: SOLR FieldCopyProcessorFactory

2013-08-05 Thread Jack Krupansky
You can use the Clone update processor to copy from one field to another, and then use the Concat update processor to combine the multiple values of that field into one. -- Jack Krupansky -Original Message- From: Luís Portela Afonso Sent: Monday, August 05, 2013 5:40 AM To: solr-user

Re: Transform data at index time: country -> continent

2013-08-05 Thread Jack Krupansky
You can write a brute force JavaScript script using the StatelessScript update processor that hard-codes the mapping. -- Jack Krupansky -Original Message- From: Christian Köhler - ZFMK Sent: Monday, August 05, 2013 5:02 AM To: solr-user@lucene.apache.org Subject: Re: Transform data at
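
To make the "hard-coded mapping" idea concrete, here is the lookup logic sketched in Python. This is not the update-processor script itself (that would be written in JavaScript and run inside Solr); it only illustrates the brute-force table approach. The handful of countries and the `Unknown` fallback are assumptions for the example.

```python
# Illustrative subset only; a real table would list every country
# that occurs in the source data.
COUNTRY_TO_CONTINENT = {
    "Germany": "Europe",
    "Kenya": "Africa",
    "Brazil": "South America",
    "Japan": "Asia",
}


def add_continent(doc, mapping=COUNTRY_TO_CONTINENT, default="Unknown"):
    """Return a copy of `doc` with a `continent` field derived from `country`."""
    enriched = dict(doc)
    enriched["continent"] = mapping.get(doc.get("country"), default)
    return enriched


print(add_continent({"id": 1, "country": "Kenya"}))
```

Because the table lives outside the database, this approach works with read-only database access.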

Re: Invalid UTF-8 character 0xfffe during shard update

2013-08-05 Thread Shawn Heisey
On 8/1/2013 7:20 AM, Federico Chiacchiaretta wrote: > on data import from a PostgreSQL db, I get the following error in solr.log: > > ERROR - 2013-08-01 09:51:00.217; org.apache.solr.common.SolrException; > shard update error RetryNode: > http://172.16.201.173:8983/solr/archive/:org.apache.solr.cl

Re: solr - using fq parameter does not retrieve an answer

2013-08-05 Thread Jack Krupansky
Is VersionNumber an "indexed" field, or just "stored"? -- Jack Krupansky -Original Message- From: Mysurf Mail Sent: Monday, August 05, 2013 4:35 AM To: solr-user@lucene.apache.org Subject: solr - using fq parameter does not retrieve an answer When I query using http://localhost:8

Re: solr - using fq parameter does not retrieve an answer

2013-08-05 Thread Shawn Heisey
On 8/5/2013 2:35 AM, Mysurf Mail wrote: > When I query using > > http://localhost:8983/solr/vault/select?q=*:* > > I get reuslts including the following > > > ... > ... > 7 > ... > > > Now I try to get only that row so I add to my query fq=VersionNumber:7 > > http://localhost:8983/so

Re: Performance question on Spatial Search

2013-08-05 Thread Shawn Heisey
On 8/5/2013 7:13 AM, Steven Bower wrote: > So after re-feeding our data with a new boolean field that is true when > data exists and false when it doesn't our search times have gone from avg > of about 20s to around 150ms... pretty amazing change in perf... It seems > like https://issues.apache.org

Re: Encountered invalid class name

2013-08-05 Thread Artem Karpenko
OK, found the source of the warnings. hadoop-hdfs jar contains META-INF/services declaration with some service provider classes. The classes JBoss warns about are all inner classes. This an issue of JBoss - see https://issues.jboss.org/browse/WFLY-1401 - and seems to be resolved in version 8-al

Re: Performance question on Spatial Search

2013-08-05 Thread David Smiley (@MITRE.org)
From: "Steven Bower-2 [via Lucene]" mailto:ml-node+s472066n4082569...@n3.nabble.com>> Date: Monday, August 5, 2013 9:14 AM To: "Smiley, David W." mailto:dsmi...@mitre.org>> Subject: Re: Performance question on Spatial Search So after re-feeding our data with a new boolean field that is true when

Re: additional requests sent to solr

2013-08-05 Thread alxsss
I care about performance. Since the data is too big, the query with terms becomes too long and slows performance. bq --- In general distributed search requires two round trips to the "other" shards." --- In this case I have three queries to solr. The third one is with {!terms..., which I do not u

[Announce] Apache Solr 4.4 with RankingAlgorithm 1.5.1 available now -- adds complex-lsa algorithm (simulates human language acquisition and recognition)

2013-08-05 Thread Nagendra Nagarajayya
I am very excited to announce the availability of Solr 4.4 with RankingAlgorithm 1.5.1. Solr 4.4 with RankingAlgorithm 1.5.1 adds a new algorithm complex-lsa. complex-lsa simulates human language acquisition and recognition (see demo ) and can ret

Re: "optimize" index : impact on performance [Republished]

2013-08-05 Thread Anca Kopetz
Hi, We already did some benchmarks during optimize and we haven't noticed a big impact on overall performance of search. The benchmarks' results were almost the same with vs. without running optimization. We have enough free RAM for the two OS disk caches during optimize (15 GB represents the

Re: Invalid UTF-8 character 0xfffe during shard update

2013-08-05 Thread Federico Chiacchiaretta
Hi Shawn, thanks for your answer. From the docs you linked I found: "This property is only relevent for server versions less than or equal to 7.2". I'm using version 9.1, I gave it a try but unfortunately I had no luck. Besides, I checked encoding settings on DB and it's UTF-8. Please note that

Re: Invalid UTF-8 character 0xfffe during shard update

2013-08-05 Thread Raymond Wiker
I think #xfffe is special; it is used as a "byte order mark" to identify the encoding used. In that case, it should only appear at the beginning of the document. Sent from my iPhone On 5 Aug 2013, at 17:19, Federico Chiacchiaretta wrote: > Hi Shawn, > thanks for your answer. > From the docs

Re: Problems matching delimited field

2013-08-05 Thread Mark
That was it… thanks On Aug 2, 2013, at 3:27 PM, Shawn Heisey wrote: > On 8/2/2013 4:16 PM, Robert Zotter wrote: >> The problem is the query get's expanded to "1 Foo" not ( "1" OR "Foo") >> >> 1Foo >> 1Foo >> +DisjunctionMaxQuery((name_textsv:"1 foo")) () >> +(name_textsv:"1 foo") () >> >> DisM

solr qtime suddenly increased in production env

2013-08-05 Thread adfel70
I have a solr cluster of 7 shards, replicationFactor 2, running on 7 physical machines. Machine spec: cpu: 16 memory: 32gb storage is on local disks Each machine runs 2 solr processes, each process with 6gb memory to jvm. The cluster currently has 330 million documents, each process around 30gb o

Re: solr qtime suddenly increased in production env

2013-08-05 Thread Shawn Heisey
On 8/5/2013 10:17 AM, adfel70 wrote: I have a solr cluster of 7 shards, replicationFactor 2, running on 7 physical machines. Machine spec: cpu: 16 memory: 32gb storage is on local disks Each machine runs 2 solr processes, each process with 6gb memory to jvm. The cluster currently has 330 millio

Re: solr qtime suddenly increased in production env

2013-08-05 Thread adfel70
Thanks for your detailed answer. Some followup questions: 1. Are there any tests I can make to determine 100% that this is a "not enough RAM" scenario? 2. Sounds like I always need to have as much RAM as the size of the index. Is this really a MUST for getting good search performance? Shawn Hei

Re: Measuring SOLR performance

2013-08-05 Thread Roman Chyla
Hi Dmitry, So I think the admin pages are different on your version of solr, what do you see when you request... ? http://localhost:8983/solr/admin/system?wt=json http://localhost:8983/solr/admin/mbeans?wt=json http://localhost:8983/solr/admin/cores?wt=json If your core -t was '/solr/statements',

Re: solr qtime suddenly increased in production env

2013-08-05 Thread Shawn Heisey
On 8/5/2013 11:27 AM, adfel70 wrote: Thanks for your detailed answer. Some followup questions: 1. Are there any tests I can make to determine 100% that this is a "not enough RAM" scenario? For heap size problems, turn on GC logging. Look at the log or run it through an analysis tool like G

Re: Percolate feature?

2013-08-05 Thread Mark
> "can match a user's query against all the terms in the index" - that's > exactly what Lucene and Solr have done since Day One, for all queries. > Percolate actually does the opposite - matches an input document against a > registered set of queries - and doesn't match against indexed documents

Re: Invalid UTF-8 character 0xfffe during shard update

2013-08-05 Thread Federico Chiacchiaretta
Hi Raymond, I agree with you, 0xfffe is a special character, that is why I was asking how it's handled in solr. In my document, 0xfffe does not appear at the beginning, it's in the content. Just an update about testing I'm doing: in a SolrCloud two shards environment, if I launch dataimport on one

Re: "optimize" index : impact on performance

2013-08-05 Thread Chris Hostetter
: Subject: "optimize" index : impact on performance : References: <1375381044900-4082026.p...@n3.nabble.com> : In-Reply-To: <1375381044900-4082026.p...@n3.nabble.com> https://people.apache.org/~hossman/#threadhijack Thread Hijacking on Mailing Lists When starting a new discussion on a mailing li

Searching on multiple cores using MultiSearcher

2013-08-05 Thread Zhang, Lisheng
Hi, At the Lucene level we have MultiSearcher to search a few cores at the same time with the same query. At the Solr level, can we perform such a search (if using the same config/schema)? Here I do not mean searching across shards of the same collection, but across independent collections. Thanks very much for your help, L

Re: Invalid UTF-8 character 0xfffe during shard update

2013-08-05 Thread Chris Hostetter
: I agree with you, 0xfffe is a special character, that is why I was asking : how it's handled in solr. : In my document, 0xfffe does not appear at the beginning, it's in the : content. Unless I'm misunderstanding something (and it's very likely that I am)... 0xfffe is not a special character -

Re: Percolate feature?

2013-08-05 Thread Jack Krupansky
Fine, then write the query that way: +foo +bar baz But it still doesn't sound as if any of this relates to prospective search/percolate. -- Jack Krupansky -Original Message- From: Mark Sent: Monday, August 05, 2013 2:11 PM To: solr-user@lucene.apache.org Subject: Re: Percolate feat

Re: Invalid UTF-8 character 0xfffe during shard update

2013-08-05 Thread Robert Muir
On Mon, Aug 5, 2013 at 11:42 AM, Chris Hostetter wrote: > > : I agree with you, 0xfffe is a special character, that is why I was asking > : how it's handled in solr. > : In my document, 0xfffe does not appear at the beginning, it's in the > : content. > > Unless I'm misunderstanding something (an

Re: Query to return different types of documents based on a field

2013-08-05 Thread André Maldonado
Jack, this is what I thought. It will be near impossible to achieve this. I will negotiate with the rest of the team to simplify this. Thanks a lot.

Re: Invalid UTF-8 character 0xfffe during shard update

2013-08-05 Thread Shawn Heisey
On 8/5/2013 12:12 PM, Federico Chiacchiaretta wrote: Hi Raymond, I agree with you, 0xfffe is a special character, that is why I was asking how it's handled in solr. In my document, 0xfffe does not appear at the beginning, it's in the content. I believe that 0xfffe is not a valid UTF-8 character, a

Re: Invalid UTF-8 character 0xfffe during shard update

2013-08-05 Thread Chris Hostetter
: > 0xfffe is not a special character -- it is explicitly *not* a character in : > Unicode at all, it is set aside as "not a character." specifically so : > that the character 0xfeff can be used as a BOM, and if the BOM is read : > incorrectly, it will cause an error. : : XML doesn't allow contro

Re: Invalid UTF-8 character 0xfffe during shard update

2013-08-05 Thread Steve Rowe
Unicode noncharacters are perfectly valid for the purpose of interchange (though as Robert points out, XML has its own ideas about this, separately from the Unicode standard). From : Q: Are noncharacters invalid in Unicode strings and UTFs?

Re: Invalid UTF-8 character 0xfffe during shard update

2013-08-05 Thread Robert Muir
On Mon, Aug 5, 2013 at 3:03 PM, Chris Hostetter wrote: > > : > 0xfffe is not a special character -- it is explicitly *not* a character in > : > Unicode at all, it is set aside as "not a character." specifically so > : > that the character 0xfeff can be used as a BOM, and if the BOM is read > : >

Re: Field append

2013-08-05 Thread Chris Hostetter
: I saw something like a CloneFieldUpdateProcessor, but when i try to use, : solr says that cannot find the class. I saw that in the follow site: : https://issues.apache.org/jira/browse/SOLR-2599 : : In the comments i saw: : You're looking at a very old comment. the issue summary tells you t

Re: Invalid UTF-8 character 0xfffe during shard update

2013-08-05 Thread Sundararaju, Shankar
The problem is that even though unicode point \u and \uFFFE are valid UTF-8 characters, they will not be parsed by standards conforming XML parsers. There is something called UTF-8 replacement character \uFFFD that can be used to replace such characters. While indexing docs, replace all such ch
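
A small Python sketch of the replacement step suggested above, meant to run on field values before they are sent to Solr. The function name is made up for the example. The character class covers U+FFFE and U+FFFF plus the C0 control characters that the XML 1.0 `Char` production excludes; treating U+FFFF and the control characters the same way as U+FFFE is an assumption of this sketch, and lone surrogates are deliberately left out to keep it short.

```python
import re

# Code points that XML 1.0 does not allow in document content:
# C0 controls other than tab, LF and CR, plus the noncharacters
# U+FFFE and U+FFFF.
_XML_UNSAFE = re.compile("[\x00-\x08\x0b\x0c\x0e-\x1f\ufffe\uffff]")

REPLACEMENT_CHARACTER = "\ufffd"


def sanitize_for_xml(text):
    """Replace characters an XML 1.0 parser would reject with U+FFFD."""
    return _XML_UNSAFE.sub(REPLACEMENT_CHARACTER, text)


cleaned = sanitize_for_xml("before\ufffeafter")
print(cleaned == "before\ufffdafter")
```

Replacing rather than deleting keeps a visible marker in the indexed text, which makes it easier to trace the affected rows back to the source database later.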

Re: Percolate feature?

2013-08-05 Thread Mark
Still not understanding. How do I know which words to require while searching? I want to search across all documents and return ones that have all of their terms matched. >> I came across the following from ElasticSearch >> (http://www.elasticsearch.org/guide/reference/api/percolate/) and it s

Re: Invalid UTF-8 character 0xfffe during shard update

2013-08-05 Thread Raymond Wiker
On Aug 5, 2013, at 20:12 , Federico Chiacchiaretta wrote: > Hi Raymond, > I agree with you, 0xfffe is a special character, that is why I was asking > how it's handled in solr. > In my document, 0xfffe does not appear at the beginning, it's in the > content. > > Just an update about testing I'm d

Re: Customize Velocity Output, Utility Class or Custom Tool

2013-08-05 Thread O. Olson
Thank you very much Erik. At this point I have trouble compiling Solr (I needed help from the IRC), so I am not qualified to submit a patch. However, now I know where this location is, I might consider creating my own tool and putting it in there :-). Thanks again, because I don’t think anyone

Re: Invalid UTF-8 character 0xfffe during shard update

2013-08-05 Thread Federico Chiacchiaretta
No, the content has no XML tags included (hope I understood what you were asking here). Federico 2013/8/5 Raymond Wiker > On Aug 5, 2013, at 20:12 , Federico Chiacchiaretta < > federico.c...@gmail.com> wrote: > > Hi Raymond, > > I agree with you, 0xfffe is a special character, that is why I wa

Best phonetic configuration?

2013-08-05 Thread SolrLover
I am trying to use phonetic algorithm to perform (approx) search but I need some help on finalizing the algorithm since each algorithm has its pros and cons. For Ex: Most of the phonetic algorithms matches 'tattoo' for the keyword 'Toyota'. Some fail to match hedison when searched for Hudson.. I

Re: Percolate feature?

2013-08-05 Thread Jack Krupansky
Percolate does not search across documents, it searches across registered queries for a single input document. As such, it still seems irrelevant to your desire to "search across all documents". You still haven't explained how you can't do what you want using basic, plain Lucene search. Now,
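
To illustrate the distinction drawn above, here is a toy Python sketch of percolation: a set of registered queries is matched against one incoming document, rather than one query against many documents. It is not how ElasticSearch implements the feature. Each registered query is simplified to a set of required terms, and the query names and whitespace tokenization are assumptions of the example.

```python
def percolate(registered_queries, document_text):
    """Return the names of registered queries whose required terms all
    occur in the single input document."""
    tokens = set(document_text.lower().split())
    return [name for name, terms in registered_queries.items()
            if terms <= tokens]


# Hypothetical registered queries, each a set of required terms.
queries = {
    "q1": {"term", "one"},
    "q2": {"term", "three"},
}

matches = percolate(queries, "Term one and two")
print(matches)
```

The same subset test, applied with the roles swapped (stored term sets checked against the tokens of a user's query), is close to the "all of their terms matched" requirement described earlier in this thread.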

Real user :)

2013-08-05 Thread Erling Løken Andersen
Hi, I'm a real human being with thoughts and feelings. I'd love to be added to the ContributorsGroup. My username is stormen :) Cheers, Erling Andersen CTO @ Gulindex.no

Re: Real user :)

2013-08-05 Thread Steve Rowe
Erling, I've added you to the Solr ContributorsGroup page. - Steve On Aug 5, 2013, at 5:31 PM, Erling Løken Andersen wrote: > Hi, > > I'm a real human being with thoughts and feelings. I'd love to be > added to the ContributorsGroup. My username is stormen :) > > Cheers, > Erling Andersen > CT

Re: Suggest aka "autocomplete" request handler with solr 4.4

2013-08-05 Thread Utkarsh Sengar
Bumping this one, is this feature maintained anymore? Thanks, -Utkarsh On Fri, Aug 2, 2013 at 2:27 PM, Utkarsh Sengar wrote: > I am trying to get autocorrect and suggest feature work on my solr 4.4 > setup. > > As recommended here: http://wiki.apache.org/solr/Suggester, this is my > solrconfig:

Re: Suggest aka "autocomplete" request handler with solr 4.4

2013-08-05 Thread Jack Krupansky
What does your allText field's type look like? Does it lower-case terms like Apple->apple? Are you sure that "apple" is indexed in that field? For comparison, or an alternative, try using the Solr terms component to see what terms are actually indexed in that field: curl "http://localhost:8
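
A hedged Python sketch of a terms-component request like the one suggested above. It only builds the URL. The core name `collection1`, the `/terms` handler path and the prefix `app` are assumptions for the example (the field name `allText` is from this thread); `terms.fl`, `terms.prefix` and `terms.limit` are standard TermsComponent parameters.

```python
from urllib.parse import urlencode


def terms_url(base_url, field, prefix, limit=10):
    """Build a URL listing indexed terms in `field` that start with `prefix`."""
    params = {
        "terms": "true",
        "terms.fl": field,
        "terms.prefix": prefix,
        "terms.limit": limit,
        "wt": "json",
    }
    return base_url + "/terms?" + urlencode(params)


# Placeholder core name and prefix, for illustration only.
url = terms_url("http://localhost:8983/solr/collection1", "allText", "app")
print(url)
```

If the response lists "apple" in lower case but nothing capitalized, the field's analysis chain is lower-casing at index time, which answers the first question above.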

Re: Suggest aka "autocomplete" request handler with solr 4.4

2013-08-05 Thread Chris Hostetter
: Where "allText" is a copy field which indexes all the content I have in : document title, description etc. what does the field & fieldType of "allText" look like? : I have reindexed my data after adding this config (i.e. loading the whole : dataset again via UpdateCSV), also tried to reload th

Re: Percolate feature?

2013-08-05 Thread Chris Hostetter
: Subject: Percolate feature? can you give a more concrete, realistic example of what you are trying to do? your synthetic hypothetical example is kind of hard to make sense of. your Subject line and comment that the "percolate" feature of elastic search sounds like what you want seems to have

Re: Percolate feature?

2013-08-05 Thread Lance Norskog
Cool! On 08/05/2013 03:34 AM, Charlie Hull wrote: On 03/08/2013 00:50, Mark wrote: We have a set number of known terms we want to match against. In Index: "term one" "term two" "term three" I know how to match all terms of a user query against the index but we would like to know how/if we ca

Re: Encountered invalid class name

2013-08-05 Thread anpm1989
Hi Artem Karpenko, Thank you for finding the reason. It seems the JBoss deployment scanner does not accept dollar characters in its validation pattern. So I think that my Solr app cannot work 100%, can it? Regards, An Pham Minh

Re: Encountered invalid class name

2013-08-05 Thread anpm1989
Hi Erick Erickson, Thank you for your idea, but I don't get your point, sorry about that :D. I guess that I have all the relevant Hadoop jars in solr.war/solr/WEB-INF/lib. But I think this is a good idea; I will check it later. Regards, An MinhPham

Boosting in function queries?

2013-08-05 Thread SolrLover
I am trying to use the below query to boost the score of dismax component but it doesn't seem to work .. _query_:"{!dismax qf=Fname v=$f_name}"^8.0 OR _query_:"{!dismax qf=Lname v=$l_name}"^8.0 Can someone let me know a way to boost Dismax / function queries without using bq?

Re: Collection - loadOnStartup

2013-08-05 Thread Srivatsan
If so, then how do I set loadOnStartup for the Collections API in Solr 4.4?

Re: Invalid UTF-8 character 0xfffe during shard update

2013-08-05 Thread Raymond Wiker
Ok, let me rephrase that slightly: does your database extraction include BLOBs or CLOBs that are actually complete documents, that might be UTF-8 encoded text? From the stack trace in your second post, it seems that the error occurs while parsing an XML file uploaded via the UpdateRequestHandler.