Re: Announce list

2014-02-04 Thread Arie Zilberstein
Hi, Thanks for your answers. I'll give some context: our project uses multiple 3rd party products, Solr is among them. Upgrading versions of 3rd parties can only take place at specific points in time in the development cycle. At these points it would be useful to be able to see what changed in

Re: weird exception on update

2014-02-04 Thread Dmitry Kan
We are still hitting an issue with two cores, each having their own custom query parser. The problem in passing {!qparser} is that the custom query parser can pretty much alter an input query into something that is not desirable for the delete by query operation. Is there any way of specifying

Re: Solr ranking query..

2014-02-04 Thread Varun Thacker
Hi Chris, I think what you are looking for could be solved using the eDismax query parser. https://cwiki.apache.org/confluence/display/solr/The+Extended+DisMax+Query+Parser 1. Your Query Fields ( qf ) would be - urlKeywords^60 title^40 fulltxt^1 2. To check on adultFlag:N you could use

Re: Import data from mysql to sold

2014-02-04 Thread rachun
please see below code.. <dataConfig> <dataSource type="JdbcDataSource" driver="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost:3306/mydb01" user="root" password=""/> <document> <entity name="users" query="select id,firstname,username from users"

Re: Duplicate Facet.FIelds cause same results, should dedupe?

2014-02-04 Thread Varun Thacker
Hi William, I doubt this is a bug. I tried on Solr 4.5.1. Indexed documents using java -jar post.jar *.xml This is the query that I fired - http://localhost:8983/solr/collection1/select?q=*:*&wt=json&indent=true&rows=0&facet=true&facet.limit=1&facet.field=name&facet.field=name And here is the
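
A minimal sketch of how a request like the one above can be assembled, assuming Python's standard urllib is used on the client side. The helper name build_select_url is hypothetical; the host, core name and parameters are the ones quoted in the message. It only shows that a repeated facet.field survives as two separate parameters in the query string; it does not contact Solr.

```python
from urllib.parse import urlencode, parse_qs

# Parameters copied from the query in the message; facet.field is
# deliberately listed twice to mirror the duplicated facet request.
params = [
    ("q", "*:*"),
    ("wt", "json"),
    ("indent", "true"),
    ("rows", "0"),
    ("facet", "true"),
    ("facet.limit", "1"),
    ("facet.field", "name"),
    ("facet.field", "name"),
]


def build_select_url(base, pairs):
    """Join a base /select URL and a list of (key, value) pairs (hypothetical helper)."""
    return base + "?" + urlencode(pairs)


url = build_select_url("http://localhost:8983/solr/collection1/select", params)

# Parse the query string back to confirm both facet.field entries are present.
query_string = url.split("?", 1)[1]
repeated = parse_qs(query_string)["facet.field"]
print(url)
print(repeated)
```

Passing a list of pairs rather than a dict is what keeps the duplicate key; a dict would silently collapse it to one entry.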

How to index the data from db using Solandra

2014-02-04 Thread Sathya
Hi Guys... I am new to Solandra. How do I index data from a database using Solandra? I configured Solandra by following this link https://github.com/tjake/Solandra and I don't know how to index the data using Solandra. Please help me... -- View

Re: How to index the data from db using Solandra

2014-02-04 Thread Furkan KAMACI
Hi; This is Solr user mailing list. This page: https://github.com/tjake/Solandra/wiki/Solandra-Wiki points to a mail list for Solandra. If you have Solr specific questions you can ask them here. Thanks; Furkan KAMACI 2014-02-04 Sathya sathia.blacks...@gmail.com: Hi Guys... I am new to

Sentence Detection for Highlighting

2014-02-04 Thread Furkan KAMACI
Hi; I want to detect sentences in Turkish documents to generate better highlighting in Solr 4.6.1. What do you recommend for that purpose? Thanks; Furkan KAMACI

RE: Sentence Detection for Highlighting

2014-02-04 Thread Markus Jelsma
Boundary scanner using Java's break iterator: http://wiki.apache.org/solr/HighlightingParameters#hl.boundaryScanner -Original message- From:Furkan KAMACI furkankam...@gmail.com Sent: Tuesday 4th February 2014 12:03 To: solr-user@lucene.apache.org Subject: Sentence Detection for

Solr Searching Issue

2014-02-04 Thread Sathya
Hi Friends, I have been working with Solr 4.6.0 for the last 2 months. I have indexed the data in Solr. The index size is 8.3GB and is increasing day by day. When I search this index from Java programs running as multiple instances (more than 15 instances), Solr does not respond to the search

Newbie question on Deduplication overWriteDupes flag

2014-02-04 Thread aagrawal75
I had a configuration where I had overwriteDupes=false. Result: I got duplicate documents in the index. When I changed to overwriteDupes=true, the duplicate documents started overwriting the older documents. How do I achieve: add if not there, fail if a duplicate is found? I thought that

Re: Import data from mysql to sold

2014-02-04 Thread Gora Mohanty
On 4 February 2014 15:28, rachun rachun.c...@gmail.com wrote: please see below code.. <dataConfig> <dataSource type="JdbcDataSource" driver="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost:3306/mydb01" user="root" password=""/> <document>

Re: Sentence Detection for Highlighting

2014-02-04 Thread Furkan KAMACI
Hi Markus; I've seen it but there is no documentation for it. 2014-02-04 Markus Jelsma markus.jel...@openindex.io: Boundary scanner using Java's break iterator: http://wiki.apache.org/solr/HighlightingParameters#hl.boundaryScanner -Original message- From:Furkan KAMACI

Re: Solr Searching Issue

2014-02-04 Thread Furkan KAMACI
Hi; Your index size is not large for a 24 GB machine. There must be some other problem. What is your document size and query rate per second? Also, how do you start up your Solr instance (which parameters do you use)? Thanks; Furkan KAMACI 2014-02-04 Sathya

Re: Solr Searching Issue

2014-02-04 Thread Sathya
Hi Furkan, I have indexed subjects that contain only 1 to 10 words each. The query time is at least 7 seconds per search. And I have only a single Solr instance. -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-Searching-Issue-tp4115207p4115234.html

Re: Solr Searching Issue

2014-02-04 Thread Furkan KAMACI
Hi; Which JVM parameters do you use? Thanks; Furkan KAMACI 2014-02-04 Sathya sathia.blacks...@gmail.com: Hi Furkan, I have indexed subjects that contain only 1 to 10 words each. The query time is at least 7 seconds per search. And I have only a single Solr instance.

Re: Solr Searching Issue

2014-02-04 Thread Jack Krupansky
Maybe you need a larger Java heap. -- Jack Krupansky -Original Message- From: Sathya Sent: Tuesday, February 4, 2014 6:11 AM To: solr-user@lucene.apache.org Subject: Solr Searching Issue Hi Friends, I have been working with Solr 4.6.0 for the last 2 months. I have indexed the data in Solr.

Re: Special NGRAMish requirement

2014-02-04 Thread Furkan KAMACI
Hi; Do you want to use token length for boosting? I mean, if longer tokens match in the EdgeNGram-filtered field, should they get a bigger boost than the others? Thanks; Furkan KAMACI 2014-02-04 Otis Gospodnetic otis.gospodne...@gmail.com: Hi, Can you provide an example, Alexander? Otis Solr

Re: How to index the data from db using Solandra

2014-02-04 Thread Jack Krupansky
Solandra was a prototype, an experiment. It was superseded by a commercial product - DataStax Enterprise (DSE). See: http://www.datastax.com/what-we-offer/products-services/datastax-enterprise Free support is available on Stack Overflow. DSE differs from Solandra in that the Lucene index is

AW: Special NGRAMish requirement

2014-02-04 Thread Lochschmied, Alexander
Hi Otis and everybody, I am not sure if Solr works this way, actually I doubt it... Let's say I look for ABC, and I set up my EdgeNGram to create everything up to one character: ABC, AB, A And maybe I have only two docs in the index with ABCD and AB, and they are also set up with that EdgeNGram

Re: Solr Searching Issue

2014-02-04 Thread Daniel Collins
You also said you have multiple instances (>15) but are they all reading the same 8GB data (in which case it must be static or you'd get locking problems) or is it partitioned/sharded somehow? I'd have the same questions as the others, query rates, how are your queries distributed over the

Re: Import data from mysql to sold

2014-02-04 Thread Alexei Martchenko
1) Yes, it's the JDBC connection URL/URI. You can use a JNDI preconfigured datasource instead. It's all here http://wiki.apache.org/solr/DataImportHandler 2) It's a mapping: column is the database column and name is your Solr destination field. You only need to specify name when both differ. DIH

Re: Solr ranking query..

2014-02-04 Thread Chris
Dear Varun, Thank you for your replies. I managed to get points 1 and 2 done, but I am unable to figure out the boost query. Could you be kind enough to point me to an example or maybe advise a bit more on that one? Thanks for your help, Chris On Tue, Feb 4, 2014 at 3:14 PM, Varun Thacker

Re: Announce list

2014-02-04 Thread Chris Hostetter
: The proposed solutions so far (the Apache general updates RSS and the Solr : general list) are too noisy and not focused enough. Well, the feed URL previously posted just happened to be for all projects, there are also per project based feeds if you want more focused RSS...

Re: need help in understating solr cloud stats data

2014-02-04 Thread Mark Miller
I think that is silly. We can still offer per shard stats *and* let a user easily see stats for a collection without requiring they jump hoops or use a specific monitoring solution where someone else has already jumped hoops for them. You don’t have to guess what ops people really want -

Re: need help in understating solr cloud stats data

2014-02-04 Thread Otis Gospodnetic
+101 for more stats. Was just saying that trying to pre-aggregate them along multiple dimensions is probably best left out of Solr. Otis -- Performance Monitoring * Log Analytics * Search Analytics Solr Elasticsearch Support * http://sematext.com/ On Tue, Feb 4, 2014 at 10:49 AM, Mark Miller

API to get documents imported in dataimport from EventListener.onEvent(Context cntxt)

2014-02-04 Thread Dileepa Jayakody
Hi All, Is there a way to retrieve the documents being imported in a dataimport request from an EventListener configured to run at onImportEnd? I need to get the set of values of the field:content of all the documents imported to perform an enhancement task. Is there a way to retrieve the

Re: need help in understating solr cloud stats data

2014-02-04 Thread Walter Underwood
I agree that sorting and filtering stats in Solr is not a good idea. There is certainly some use in aggregation, though. One request to /admin/mbeans replaces about 50 JMX requests. Is anybody working on https://issues.apache.org/jira/browse/SOLR-4735? wunder On Feb 4, 2014, at 8:13 AM, Otis

Lowering query time

2014-02-04 Thread Joel Cohen
Hi all, I'm in the process of migrating from a proprietary search technology (Endeca) to SolrCloud 4.6.1. I'm running 4 servers with 8 cores (2 threads per core with hyperthreading) and 64 GB of RAM per server. Each server has 4 shards and each shard is running in a separate Tomcat instance with a 4

Re: Lowering query time

2014-02-04 Thread Yonik Seeley
On Tue, Feb 4, 2014 at 12:12 PM, Joel Cohen joel.co...@bluefly.com wrote: I'm trying to get the query time down to ~15 msec. Anyone have any tuning recommendations? I guess it depends on what the slowest part of the query currently is. If you are faceting, it's often that. Also, it's often a

Re: Lowering query time

2014-02-04 Thread Joel Cohen
1. We are faceting. I'm not a developer so I'm not quite sure how we're doing it. How can I measure? 2. I'm not sure how we'd force this kind of document partitioning. I can see how my shards are partitioned by looking at the clusterstate.json from Zookeeper, but I don't have a clue on how to get

Re: Lowering query time

2014-02-04 Thread Yonik Seeley
On Tue, Feb 4, 2014 at 1:43 PM, Joel Cohen joel.co...@bluefly.com wrote: 1. We are faceting. I'm not a developer so I'm not quite sure how we're doing it. How can I measure? Add debugQuery=true to the request and look at the timings of various components. 2. I'm not sure how we'd force this

Re: Lowering query time

2014-02-04 Thread Jack Krupansky
Add the debug=true parameter to some test queries and look at the timing section to see which search components are taking the time. Traditionally, highlighting for large documents was a top culprit. Are you returning a lot of data or field values? Sometimes reducing the amount of data
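
A minimal sketch of reading the timing section mentioned above, assuming the JSON response exposes it as debug -> timing -> process -> <component> -> time. That layout is an assumption based on typical debug output; the function name slowest_components is hypothetical and the sample numbers are invented for illustration, not measurements from this thread.

```python
def slowest_components(response, stage="process"):
    """Return (component, milliseconds) pairs for one timing stage, slowest first.

    Assumes the layout debug -> timing -> <stage> -> <component> -> time.
    """
    timing = response.get("debug", {}).get("timing", {})
    stage_timing = timing.get(stage, {})
    pairs = [
        (name, entry["time"])
        for name, entry in stage_timing.items()
        # The stage also carries its own total under "time"; skip non-dict entries.
        if isinstance(entry, dict) and "time" in entry
    ]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


# Invented sample shaped like the assumed debug output.
sample = {
    "debug": {
        "timing": {
            "time": 52.0,
            "prepare": {"time": 2.0, "query": {"time": 2.0}},
            "process": {
                "time": 50.0,
                "query": {"time": 8.0},
                "facet": {"time": 40.0},
                "highlight": {"time": 0.0},
                "debug": {"time": 2.0},
            },
        }
    }
}

for name, millis in slowest_components(sample):
    print(name, millis)
```

Collected across a batch of logged test queries, a ranking like this makes it easier to see whether faceting, highlighting or the main query dominates.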

Re: Lowering query time

2014-02-04 Thread Joel Cohen
I plan on adding the debug flag to my queries and collecting that from my logs. I don't think we're using highlighting. Would anyone be able to tell from a raw query? q=%7B%21q.op%3DAND%7D*&defType=edismax&qf=color_en%5E2.0+q_vendorColor_en%5E17.0+q_bullet2_en%5E19.0+v

RE: Lowering query time

2014-02-04 Thread Alexey Kozhemiakin
Btw, timing for distributed requests is broken at the moment: it doesn't combine values from requests to shards. I'm working on a patch. https://issues.apache.org/jira/browse/SOLR-3644 -Original Message- From: Jack Krupansky [mailto:j...@basetechnology.com] Sent: Tuesday, February

Re: Duplicate Facet.FIelds cause same results, should dedupe?

2014-02-04 Thread Chris Hostetter
: facet.field=prac_spec_heir&facet.field=prac_spec_heir ... : Thoughts? Seems like a new bug in 4.6 ? Nope ... it's always been like that. We could conceivably dedup, but that seems like unnecessary cycles in most cases -- if the client asks for redundant faceting, the client gets

Re: weird exception on update

2014-02-04 Thread Chris Hostetter
: We found out that: : : 1. this happens iff on two cores inside the same container there is a : query parser defined via defType. : 2. After removing index files on one of the cores, the delete by query : works just fine. Right after restarting the container, the same query fails.

Re: Lowering query time

2014-02-04 Thread Alexandre Rafalovitch
I suspect faceting is the issue here. The actual query you have shown seems to bring back a single document (or a single set of documents for a product): fq=id:(320403401) On the other hand, you are asking for 4 field facets: facet.field=q_virtualCategory_ss facet.field=q_brand_s

Re: Announce list

2014-02-04 Thread Alexandre Rafalovitch
On Tue, Feb 4, 2014 at 10:40 PM, Chris Hostetter hossman_luc...@fucit.org wrote: Well, the feed URL previously posted just happened to be for all projects, there are also per project based feeds if you want more focused RSS... https://projects.apache.org/feeds/rss/solr.xml I think this is

Max Limit to Schema Fields - Solr 4.X

2014-02-04 Thread Mike L.
Solr user group - I'm afraid I may have a scenario where I might need to define a few thousand fields in Solr. The context here is, this type of data is extremely granular and unfortunately cannot be grouped into logical groupings or aggregate fields because there is a need to know

Re: Import data from mysql to sold

2014-02-04 Thread rachun
I would just like to thank all the gurus for their comments ;) Now I have got it working. Thank you for sharing and helping. _/|\_ Chun. -- View this message in context: http://lucene.472066.n3.nabble.com/Import-data-from-mysql-to-sold-tp4114982p4115431.html Sent from the Solr - User mailing list archive

SolrCloud fails to create new collections

2014-02-04 Thread Ray Cheng
Hi, Our SolrCloud with Solr 4.6.0 has two replicas and uses three external ZooKeeper servers. We have created 20 collections successfully in the last two weeks. Then, creating new collections started failing yesterday. New collections are created with curl like this one: curl

Solr Deduplication use of overWriteDupes flag

2014-02-04 Thread Amit Agrawal
Hello, I had a configuration where I had overwriteDupes=false. I added a few duplicate documents. Result: I got duplicate documents in the index. When I changed to overwriteDupes=true, the duplicate documents started overwriting the older documents. Question 1: How do I achieve, [add if not

Re: Max Limit to Schema Fields - Solr 4.X

2014-02-04 Thread Jack Krupansky
What will your queries be like? Will it be okay if they are relatively slow? I mean, how many of those 100 fields will you need to use in a typical (95th percentile) query? -- Jack Krupansky -Original Message- From: Mike L. Sent: Tuesday, February 4, 2014 10:00 PM To:

Re: Solr Searching Issue

2014-02-04 Thread Sathya
Hi, Yes, all the instances are reading the same 8GB data at the same time. The Java search programs (>15 instances) are running on different machines, in different JVMs, and they access the Solr server machine (Ubuntu 64 bit). And the Solr index is not sharded. The query rates are too poor (more than 5 seconds

Re: Max Limit to Schema Fields - Solr 4.X

2014-02-04 Thread Mike L.
Hey Jack - Two types of queries: A) Return all docs that have a match for a particular value from a particular field (fq=fieldname:value). Because of this I feel I'm tied to defining all the fields. No particular field matters more than another - depends on the search context so hard to

RE: SolrCloud multiple data center support

2014-02-04 Thread Darrell Burgan
Interesting about the Zookeeper quorum problem. What if we were to run three Zookeepers in our primary data center and four in the backup data center. If we failed over, we wouldn't have a quorum, but we could kill one of the Zookeepers to restore a quorum, couldn't we? If we did extend the

RE: SolrCloud multiple data center support

2014-02-04 Thread Darrell Burgan
Thanks - I was unaware of Flume and will investigate it. It looks like it has specific features for replicating Solr data? Have you or has anyone on the list used it for this purpose? Thanks again, Darrel -Original Message- From: Mark Miller [mailto:markrmil...@gmail.com] Sent:

Re: Max Limit to Schema Fields - Solr 4.X

2014-02-04 Thread Alexandre Rafalovitch
You could probably manage the schema by using dynamic fields. Also, enable lazy loading to avoid loading the values of the fields you do not care about. Regards, Alex. Personal website: http://www.outerthoughts.com/ LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch - Time is the

Re: Solr Searching Issue

2014-02-04 Thread Shawn Heisey
On 2/4/2014 9:49 PM, Sathya wrote: Yes, all the instances are reading the same 8GB data at the same time. The Java search programs (>15 instances) are running on different machines, in different JVMs, and they access the Solr server machine (Ubuntu 64 bit). And the Solr index is not sharded. The query rates

Re: SolrCloud multiple data center support

2014-02-04 Thread Shawn Heisey
On 2/4/2014 10:14 PM, Darrell Burgan wrote: Interesting about the Zookeeper quorum problem. What if we were to run three Zookeepers in our primary data center and four in the backup data center. If we failed over, we wouldn't have a quorum, but we could kill one of the Zookeepers to restore
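
A rough sketch of the majority arithmetic behind the question above, assuming a statically configured ensemble in which a quorum is a strict majority of the configured servers, not of the servers that happen to be alive. The function names are hypothetical; the 3 + 4 split is the one proposed in the message.

```python
def quorum_size(ensemble_size):
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def has_quorum(ensemble_size, live_servers):
    """True if the live servers are a majority of the configured ensemble."""
    return live_servers >= quorum_size(ensemble_size)


# Layout from the message: three servers in the primary data center,
# four in the backup data center, seven configured in total.
primary, backup = 3, 4
ensemble = primary + backup

print("quorum needed:", quorum_size(ensemble))
print("only primary alive:", has_quorum(ensemble, primary))
print("only backup alive:", has_quorum(ensemble, backup))
```

Under this assumption the side holding the majority of configured servers keeps a quorum on its own and the other side does not, and stopping a server does not lower the number required.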

Re: Max Limit to Schema Fields - Solr 4.X

2014-02-04 Thread Shawn Heisey
On 2/4/2014 8:00 PM, Mike L. wrote: I'm just wondering here if there is any defined limit to how many fields can be created within a schema? I'm sure the configuration maintenance of a schema like this would be a nightmare, but would like to know if it's at all possible in the first place

Re: Solr Searching Issue

2014-02-04 Thread Sathya
Hi Shawn, I am running a single Solr instance and the JVM heap space is minimum 6.3GB and maximum 24.31GB. Nothing other than the Tomcat server is running to compete for the 24GB. I have only 2 copyField entries. On Wed, Feb 5, 2014 at 11:49 AM, Shawn Heisey-4 [via Lucene]

Re: Solr ranking query..

2014-02-04 Thread Varun Thacker
Hi Chris, An example for point 3 could be - boost=recip(field(domainRank),0.1,1,1) http://wiki.apache.org/solr/FunctionQuery#recip recip(x,m,a,b) implementing a/(m*x+b). m,a,b are constants, x is any numeric field or arbitrarily complex function. So with these values when domainRank is 1 it
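
A small worked sketch of the formula quoted above, a/(m*x+b), using the constants from the example boost (m=0.1, a=1, b=1). The Python function only mirrors the arithmetic and is not Solr code; the domainRank values tried are illustrative assumptions.

```python
def recip(x, m, a, b):
    """Reciprocal function as described in the message: a / (m*x + b)."""
    return a / (m * x + b)


# Constants from boost=recip(field(domainRank),0.1,1,1); sample ranks are made up.
for domain_rank in (1, 10, 100):
    print(domain_rank, round(recip(domain_rank, 0.1, 1, 1), 3))
```

With these constants the multiplier shrinks as domainRank grows, so documents with a lower domainRank value receive the larger boost.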