Re: Restrict/change numFound solr result

2013-07-08 Thread Ralf Heyde
Can you explain a little more about what you are trying to do? I don't understand what you want. On 07/06/2013 08:39 AM, aniljayanti wrote: Hi, I am working on Solr 3.3. I am getting a total of 120 records with the query below, but in the response XML numFound shows 540 records.

Re: [Announcement] Norch- a search engine for node.js

2013-07-08 Thread Fergus McDowall
William: Geosearch: coming soon! F On Sun, Jul 7, 2013 at 6:29 AM, William Bell billnb...@gmail.com wrote: Can it do Geo Spatial searching? (i.e. Find documents within 10 miles of a lat,long?) On Fri, Jul 5, 2013 at 12:53 PM, Fergus McDowall fergusmcdow...@gmail.com wrote: Here is

SolrCloud vs Distributed Solr

2013-07-08 Thread Flavio Pompermaier
Hi to all, I started following this mailing list about 1 month ago and I read many threads about SolrCloud and distributed Solr. I just want to check if I understood correctly and, if so, ask for some architectural decision I have to take: 1) At the moment, in order to design a scalable Solr

Re: Surround query parser not working?

2013-07-08 Thread Abeygunawardena, Niran
Hi, Thanks. I found out that my issue was the default field (df) was being ignored and I had to specify the parameter by adding df=text in the URL. Thank you for updating the wiki page on the surround parser: http://wiki.apache.org/solr/SurroundQueryParser Hopefully, ordered proximity searches

Atomic updates and indexed fields

2013-07-08 Thread Bram Van Dam
Howdy, Atomic updates only work on stored fields. When submitting an update, any non-stored fields are apparently emptied. Making all fields stored is simply not possible from a performance point of view in my case. Neither is resubmitting all fields. Are there any plans to change this

Re: Are the XML element names in schema.xml case sensitive?

2013-07-08 Thread Alexandre Rafalovitch
But not dynamicField or any others? Regards, Alex On 7 Jul 2013 23:39, Jack Krupansky j...@basetechnology.com wrote: Yes, the XML element names (tags) and attribute names are all case sensitive, but... Solr has a special hack for fieldtype as well as fieldType. -- Jack Krupansky

Re: Restrict/change numFound solr result

2013-07-08 Thread aniljayanti
Hi Erick, thanks for the reply. Querying Solr returns 540 records, but I have a requirement to return only 120 records even when more than 120 match. After getting the results I calculate the paging on the basis of *numFound=120*. That is why I am checking this. If I take 12 record

Re: Restrict/change numFound solr result

2013-07-08 Thread aniljayanti
Hi Ralf, thanks for the reply. I have 540 records as a Solr result. Of those I want only 120 records (<result name="response" numFound="120" start="0">); I calculate the paging from the numFound value. If I want to show 12 records per page, then (120/12 = 10 pages) I would have 10 pages. Right.

Re: Custom Hashing

2013-07-08 Thread Erick Erickson
You have to tell us _how_ it's not working to get a meaningful answer. Perhaps you could review: http://wiki.apache.org/solr/UsingMailingLists Best Erick On Sun, Jul 7, 2013 at 3:05 PM, lampa24 krogli...@gmail.com wrote: Hello , I use Solr Cloud 4.3.1. We use query join , so we need index

Re: SolrCloud vs Distributed Solr

2013-07-08 Thread Erick Erickson
Flavio: I think you're missing a critical bit about SolrCloud, namely Zookeeper (ZK), see here on the SolrCloud page for a start: http://wiki.apache.org/solr/SolrCloud#ZooKeeper You'll notice that each Solr node, when it is started, requires the address of your ZK ensemble, NOT a solr node. That

Re: Atomic updates and indexed fields

2013-07-08 Thread Erick Erickson
Bram: see: https://issues.apache.org/jira/browse/LUCENE-4258 I'm sure the people working on this would gladly get all the help they can. WARNING: I suspect (although I haven't looked myself) that this is very hairy code G. bq: Making all fields stored is simply not possible from a performance

Re: Restrict/change numFound solr result

2013-07-08 Thread Erick Erickson
Just specify successively larger start=... parameters. When you want page 1 you specify start=0&rows=10, page 2: start=10&rows=10, and so on. See: http://wiki.apache.org/solr/CommonQueryParameters#start Best Erick On Mon, Jul 8, 2013 at 5:36 AM, aniljayanti aniljaya...@yahoo.co.in wrote: Hi Ralf,
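
As a rough sketch of the paging arithmetic discussed in this thread: rather than changing numFound, a client could cap the total it pages over and derive start and rows itself. The numbers (540 found, a cap of 120, 12 per page) come from the messages above; the function names and the cap parameter are illustrative assumptions, not part of Solr.

```python
def page_params(page, rows_per_page=12, num_found=540, cap=120):
    """Return (start, rows) for a 1-based page, or None if the page is out of range."""
    total = min(num_found, cap)          # never page beyond the cap, even if Solr found more
    start = (page - 1) * rows_per_page   # Solr's start parameter is a zero-based offset
    if page < 1 or start >= total:
        return None
    rows = min(rows_per_page, total - start)  # the last page may be short
    return start, rows


def page_count(rows_per_page=12, num_found=540, cap=120):
    """Number of pages the client should offer, given the cap."""
    total = min(num_found, cap)
    return -(-total // rows_per_page)    # ceiling division
```

With these defaults, page 1 maps to start=0 and rows=12, page 10 to start=108 and rows=12, and page 11 is rejected because it would start past the 120-record cap.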

Is it possible to facet on existence of a field?

2013-07-08 Thread adfel70
I have a field that's only indexed in some of the documents. Can I create a boolean facet on this field by its existence? for instance: yes(124) no(479) Note that the fields' value is not facetable because all its values are unique most of the time. I just want to facet on the question whether

Solr 3.5.0 - logging per core?

2013-07-08 Thread Rafał Radecki
Hi All. I have some solr 3.5.0 instances on CentOS 6 x86_64. Currently logging is by going to syslog (logger), for all my cores to one shared logfile. I would like to have a separate logfile for each core. My solr process is started from a simple shell script: daemon $JAVA -Xms25g -Xmx25g

Re: Solr 4.x union of cross-joins

2013-07-08 Thread mihaela olteanu
Here is exactly the data that I'm working with and the results for some tests that I performed: parent       child1       child2       id KEY_S COMMENT_T TYPE_S id KEY_S COMMENT_T TYPE_S id KEY_S COMMENT_T TYPE_S 1 1 ventilation test Parent 4 1 comment4 Child1 7 1 comment6 Child2 2 2 comment2

Re: Is it possible to facet on existence of a field?

2013-07-08 Thread Yonik Seeley
On Mon, Jul 8, 2013 at 8:18 AM, adfel70 adfe...@gmail.com wrote: I have a field that's only indexed in some of the documents. Can I create a boolean facet on this field by its existence? for instance: yes(124) no(479) Note that the fields' value is not facetable because all its values are
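
The reply above is cut off in this archive, so the following is not a reconstruction of it. One common way to get yes/no counts for field existence is a pair of facet.query parameters using an open-ended range. A minimal sketch, assuming a hypothetical field called my_field and the standard Lucene query parser:

```python
from urllib.parse import urlencode


def existence_facet_params(field, q="*:*"):
    """Build request parameters that count documents with and without a value in `field`."""
    has_value = f"{field}:[* TO *]"    # matches documents where the field has any value
    no_value = f"-{field}:[* TO *]"    # pure negative query: documents lacking the field
    return [
        ("q", q),
        ("rows", "0"),                 # only the counts are needed, not the documents
        ("facet", "true"),
        ("facet.query", has_value),
        ("facet.query", no_value),
    ]


# my_field is a placeholder name, not taken from the thread
query_string = urlencode(existence_facet_params("my_field"))
```

The two facet.query counts in the response would then correspond to the "yes" and "no" buckets asked about.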

Re: SolrCloud vs Distributed Solr

2013-07-08 Thread Flavio Pompermaier
Thanks for the detailed response Erick, you helped me a lot in clarifying many Solr concepts! Best, Flavio On Mon, Jul 8, 2013 at 1:59 PM, Erick Erickson erickerick...@gmail.com wrote: Flavio: I think you're missing a critical bit about SolrCloud, namely Zookeeper (ZK), see here on the

Re: Solr 4.x union of cross-joins

2013-07-08 Thread mihaela olteanu
It seems that I lost the formatting of the data.
parent
id  KEY_S  COMMENT_T         TYPE_S
1   1      ventilation test  Parent
2   2      comment2          Parent
3   3      comment3          Parent
child1
id  KEY_S  COMMENT_T         TYPE_S
4   1      comment4          Child1
5   2      ventilation test  Child1
6   3      comment5          Child1
child2
id  KEY_S  COMMENT_T         TYPE_S
7   1      comment6          Child2
8   2      comment7          Child2

Solr limitations

2013-07-08 Thread Marcelo Elias Del Valle
Hello everyone, I am trying to find information about possible Solr limitations I should consider in my architecture. Things like max number of dynamic fields, max number of documents in SolrCloud, etc. Does anyone know where I can find this info? Best regards, -- Marcelo Elias Del

Data Import from RDBMS+File

2013-07-08 Thread Raheel Hasan
Hi everyone, I am looking for a way to import/index data such that I load data from table_1 and, instead of joining with table_2, import the rest of the joined data from a file. The name of the file comes from a field in table_1. Is it possible, and is it easy? --

Re: Atomic updates and indexed fields

2013-07-08 Thread Bram Van Dam
see: https://issues.apache.org/jira/browse/LUCENE-4258 I'm sure the people working on this would gladly get all the help they can. WARNING: I suspect (although I haven't looked myself) that this is very hairy code G. Ah excellent! Thanks! Exactly what I was looking for. Looks like this has been

Re: Solr limitations

2013-07-08 Thread Jack Krupansky
Other than the per-node/per-collection limit of 2 billion documents per Lucene index, most of the limits of Solr are performance-based limits - Solr can handle it, but the performance may not be acceptable. Dynamic fields are a great example. Nothing prevents you from creating a document with,

Every collection.reload makes zookeeper think shards are down

2013-07-08 Thread adfel70
Hi each time I reload a collection via collections API, zookeeper thinks that all the shards in the collection are down. It marks them as down and I can't send requests. Why thinks? because if I manually edit clusterstate.json file and set 'state' value to 'active', they come back up and

Re: Data Import from RDBMS+File

2013-07-08 Thread Alexandre Rafalovitch
Did you have a chance to look at DIH with nested entities yet? That's probably the way to go to start out. Or a custom client, of course. Or, ETL solutions that support Solr (e.g. Apache Flume - not personally tested yet). Regards, Alex. Personal website: http://www.outerthoughts.com/

Re: Atomic updates and indexed fields

2013-07-08 Thread Jack Krupansky
Consider keeping your stored/updatable fields in a separate, parallel collection. It makes queries a multi-step operation, but gives you a lot more flexibility. In some cases (but not all), external file fields can eliminate the need to directly update indexed documents. Or, consider a

Re: Surround query parser not working?

2013-07-08 Thread Jack Krupansky
Yes, you should be able to use nested query parsers to mix the queries. Solr 4.1(?) made it easier. -- Jack Krupansky -Original Message- From: Abeygunawardena, Niran Sent: Monday, July 08, 2013 7:00 AM To: solr-user@lucene.apache.org Subject: Re: Surround query parser not working?

Re: Data Import from RDBMS+File

2013-07-08 Thread Raheel Hasan
On this page (http://wiki.apache.org/solr/DataImportHandler), I can't see how it's possible. Perhaps there is another guide. Basically, this is what I am doing: Index data from multiple tables into Solr (see here http://wiki.apache.org/solr/DIHQuickStart). I need to skip 1 very big heavy table as

Re: Data Import from RDBMS+File

2013-07-08 Thread Alexandre Rafalovitch
http://wiki.apache.org/solr/DataImportHandler#PlainTextEntityProcessor or http://wiki.apache.org/solr/DataImportHandler#LineEntityProcessor ? The file name gets exposed as a ${entityname.fieldname} variable. You can probably copy/manipulate it with a transformer on the external entity before it

Re: Data Import from RDBMS+File

2013-07-08 Thread Raheel Hasan
OK, great. Can I use this EntityProcessor within JdbcDataSource? Like this: <dataConfig> <dataSource type="JdbcDataSource" driver="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost/db_1" user="root" password="" autoCommit="true"

Re: Every collection.reload makes zookeeper think shards are down

2013-07-08 Thread Mark Miller
It's a known bug, fix coming in 4.4, 4.4 likely coming within a couple weeks. https://issues.apache.org/jira/browse/SOLR-4805 - Mark On Jul 8, 2013, at 10:30 AM, adfel70 adfe...@gmail.com wrote: Hi each time I reload a collection via collections API, zookeeper thinks that all the shards

Re: Solr cloud date based partitioning

2013-07-08 Thread Mark Miller
The strategy doesn't require putting all the recent data on a single node. What has been suggested is collection based - the most recent data will simply be in its own collection, which may or may not be on a single node. This is pretty much always going to be advantageous for time series data.

Re: Data Import from RDBMS+File

2013-07-08 Thread Alexandre Rafalovitch
You can mix and match the data sources in nested entities, yes. Just make sure that you declare your data sources at the top and refer to them properly. As per documentation: Ensure that the dataSource is of type DataSourceReader (FileDataSource, URLDataSource). So you need to declare one at the

Re: Using the Schema API from SolrJ

2013-07-08 Thread Shawn Heisey
On 7/6/2013 4:27 PM, Steven Glass wrote: Thanks for your response. But it seems like there should be a way to issue the equivalent of http://localhost:8983/solr/schema/version which returns {"responseHeader":{"status":0,"QTime":4},"version":1.5} from the server.

Re: Are the XML element names in schema.xml case sensitive?

2013-07-08 Thread Jack Krupansky
Nope. -- Jack Krupansky -Original Message- From: Alexandre Rafalovitch Sent: Monday, July 08, 2013 7:20 AM To: solr-user@lucene.apache.org Subject: Re: Are the XML element names in schema.xml case sensitive? But not dynamicField or any others? Regards, Alex On 7 Jul 2013 23:39,

SolrJ and SolrCloud

2013-07-08 Thread Ali, Saqib
Hello all, We have an app that uses the SolrJ and instantiates using HttpSolrServer. Now that we would like to move to SolrCloud, can we still use the same app, or do we HAVE to switch to CloudSolrServer server = new CloudSolrServer(?); right away? Or will point to one instance using

Re: SolrJ and SolrCloud

2013-07-08 Thread Mark Miller
On Jul 8, 2013, at 1:40 PM, Ali, Saqib docbook@gmail.com wrote: Hello all, We have an app that uses the SolrJ and instantiates using HttpSolrServer. Now that we would like to move to SolrCloud, can we still use the same app, or do we HAVE to switch to CloudSolrServer server = new

Re: SolrJ and SolrCloud

2013-07-08 Thread Ali, Saqib
Thanks Mark! On Mon, Jul 8, 2013 at 10:46 AM, Mark Miller markrmil...@gmail.com wrote: On Jul 8, 2013, at 1:40 PM, Ali, Saqib docbook@gmail.com wrote: Hello all, We have an app that uses the SolrJ and instantiates using HttpSolrServer. Now that we would like to move to

Re: Concurrent Modification Exception

2013-07-08 Thread adityab
For reference https://issues.apache.org/jira/browse/SOLR-5019 -- View this message in context: http://lucene.472066.n3.nabble.com/Concurrent-Modification-Exception-tp4074371p4076330.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Solr limitations

2013-07-08 Thread Marcelo Elias Del Valle
Jack, Thanks a lot for your answers. I had heard at Cassandra Summit that Solr can't support more than 1024 dynamic fields, and that many might be possible in my case; that's why I asked this question. However, your answer was very complete and made me think about a lot of things. The

joins in solr cloud - good or bad idea?

2013-07-08 Thread Marcelo Elias Del Valle
Hello all, I am using Solr Cloud today and I have the following need: - My queries focus on counting how many users meet some criteria. So my main document is user (parent table) - Each user can access several web pages (a child table) and each web page might have several

Re: Solr limitations

2013-07-08 Thread Jack Krupansky
That 1024 limit of the DataStax Enterprise packaging of Solr is going to be relaxed in a coming release - you will be able to have more dynamic fields, but... going wild has memory and performance implications anyway. That limit is the number of populated fields in a single document - different

Re: What are the options for obtaining IDF at interactive speeds?

2013-07-08 Thread Kathryn Mazaitis
Hi All, Resolution: I ended up cheating. :P Though now that I look at it, I think this was Roman's second suggestion. Thanks! Since the application that will be processing the IDF figures is located on the same machine as SOLR, I opened a second IndexReader on the lucene index and used

Re: What are the options for obtaining IDF at interactive speeds?

2013-07-08 Thread Roman Chyla
Hi, I am curious about the functional query, did you try it and it didn't work? or was it too slow? idf(other_field,field(term)) Thanks! roman On Mon, Jul 8, 2013 at 4:34 PM, Kathryn Mazaitis ka...@rivard.org wrote: Hi All, Resolution: I ended up cheating. :P Though now that I look at

Re: joins in solr cloud - good or bad idea?

2013-07-08 Thread Roman Chyla
Hello, The joins are not the only idea, you may want to write your own function (ValueSource) that can implement your logic. However, I think you should not throw away the regex idea (as being slow), before trying it out - because it can be faster than the joins. Your problem is that the number

solr way to exclude terms

2013-07-08 Thread Angela Zhu
Is there a Solr way to remove from the list of search results any result that contains a term in an exclusion list? For example, suppose I search for apple and get 5 documents that contain it, and my exclusion list is something like ['bad', 'wrong', 'staled']. Out of the 5 documents, 3 have a word in this

Re: solr way to exclude terms

2013-07-08 Thread Roman Chyla
One approach is to create a new indexed field based on the stopwords (i.e. accept only stopwords :)) - i.e. if the document contains them, you index 1 - and use q=apple&fq=bad_apple:0. This has many limitations (in terms of flexibility), but it will be superfast. roman On Mon, Jul 8, 2013
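
A minimal sketch of the flag-field idea described above, assuming the flag is computed client-side before the document is sent to Solr. The field name bad_apple and the exclusion list come from the thread; the tokenization and the function names are illustrative assumptions, and a real setup might compute the flag inside Solr's analysis chain instead.

```python
import re
from urllib.parse import urlencode

# Exclusion list as given in the original question
EXCLUDED = {"bad", "wrong", "staled"}


def bad_apple_flag(text, excluded=EXCLUDED):
    """Return 1 if the text contains any excluded term, else 0 (value for the bad_apple field)."""
    tokens = set(re.findall(r"\w+", text.lower()))  # crude word tokenizer, for illustration only
    return 1 if tokens & excluded else 0


def search_params(term):
    """Query string that searches for `term` and filters out flagged documents."""
    return urlencode([("q", term), ("fq", "bad_apple:0")])
```

The trade-off noted in the message applies here too: the exclusion list is baked in at index time, so changing it means reindexing the affected documents.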

SolrCloud on Jboss

2013-07-08 Thread Ali, Saqib
Hello, Does anyone have step-by-step instructions for running SolrCloud on Jboss? Thanks

Re: joins in solr cloud - good or bad idea?

2013-07-08 Thread Marcelo Elias Del Valle
Roman, The video was very clarifying and I realized block joins would be a great fit for my problem. However, I got worried about the size of the block... I could have 10 million children for 1 parent, for instance. Although this could stay in the same shard, do you guys think it would be a huge

Solr 3.6 optimize and field cache question

2013-07-08 Thread Joshi, Shital
Hi, We have Solr 3.6 set up with master and two slaves, each one with 70GB JVM. We run into java.lang.OutOfMemoryError when we cross 250 million documents. Every time this happens we purge documents, bring it below 200 million and bounce both slaves. We have facets on 14 fields. We usually

Re: solr way to exclude terms

2013-07-08 Thread Jack Krupansky
What is the actual use case? In other words, why is the list so long? Maybe exclusion by keyword is not the proper solution... but we need to know what the underlying problem is. Is this for document access control? -- Jack Krupansky -Original Message- From: Angela Zhu Sent: Monday, July

Indexing fails for docs with high Latin1 chars

2013-07-08 Thread John Randall
I'm new to Solr, so I'm probably missing something. So far I've successfully indexed .xml docs with low Ascii chars. However when I try to add a doc that has Latin1 chars with diacritics, it fails. I've tried using the Jetty exampledocs post.jar, as well as using curl and directly from a

Re: Indexing fails for docs with high Latin1 chars

2013-07-08 Thread Jack Krupansky
Maybe you need to add ; charset=UTF-8 to your Content-type: curl "http://localhost:8080/solr/update/?commit=true&stream.file=c:/solr/tml/exampledocs/57917486.xml&stream.contentType=application/xml; charset=UTF-8" -- Jack Krupansky -Original Message- From: John Randall Sent: Monday,

Re: Indexing fails for docs with high Latin1 chars

2013-07-08 Thread Shawn Heisey
On 7/8/2013 4:43 PM, John Randall wrote: I'm new to Solr, so I'm probably missing something. So far I've successfully indexed .xml docs with low Ascii chars. However when I try to add a doc that has Latin1 chars with diacritics, it fails. I've tried using the Jetty exampledocs post.jar, as

ANNOUNCE: CFP Lucene/Solr Revolution EU 2013 (Deadline August 2nd)

2013-07-08 Thread Chris Hostetter
(NOTE: cross-posted to various lists, please reply only to general@lucene w/ any questions or follow ups) The Call for Papers for Lucene/Solr Revolution EU 2013 is currently open. http://www.lucenerevolution.org/2013/call-for-papers Lucene/Solr Revolution is the biggest open source

Re: Indexing fails for docs with high Latin1 chars

2013-07-08 Thread Jack Krupansky
Right, the charset must agree with the charset of the program that wrote the file. -- Jack Krupansky -Original Message- From: Shawn Heisey Sent: Monday, July 08, 2013 7:43 PM To: solr-user@lucene.apache.org Subject: Re: Indexing fails for docs with high Latin1 chars On 7/8/2013 4:43

Re: Indexing fails for docs with high Latin1 chars

2013-07-08 Thread John Randall
I tried that. It didn't work. I forgot to mention in my first email that I'm using Solr 3.6. Would that make a difference? From: Jack Krupansky j...@basetechnology.com To: solr-user@lucene.apache.org; John Randall jmr...@yahoo.com Sent: Monday, July 8, 2013

Re: Indexing fails for docs with high Latin1 chars

2013-07-08 Thread John Randall
Per Jack's suggestion, I changed the heading in the .xml file to <?xml version="1.0" encoding="LATIN1"?> and it worked. Thanks so much guys! From: Shawn Heisey s...@elyograg.org To: solr-user@lucene.apache.org Sent: Monday, July 8, 2013 7:43 PM Subject: Re: Indexing
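
The failure mode in this thread, where the declared charset does not match the bytes actually written to the file, can be illustrated outside Solr. The sketch below only demonstrates the mismatch itself; the sample string and helper name are assumptions for illustration and nothing here calls Solr.

```python
# A word with a Latin-1 diacritic, encoded two different ways
text = "café"
latin1_bytes = text.encode("latin-1")  # é is the single byte 0xE9
utf8_bytes = text.encode("utf-8")      # é is the two bytes 0xC3 0xA9


def decodes_as(data, encoding):
    """Return True if `data` can be decoded with `encoding` without an error."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False
```

Latin-1 bytes read as UTF-8 fail outright, which matches a parser rejecting the document. UTF-8 bytes read as Latin-1 decode without error but yield the wrong characters, so a mismatch in that direction can go unnoticed.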

RE: Solr 3.6 optimize and field cache question

2013-07-08 Thread Michael Ryan
I'm 99% sure that the deleted docs will indeed use up space in the field cache, at least until the segments that those documents are in are merged - that is what an optimize will do. Of course, these segments will automatically be merged eventually, but it might take days for this to happen,

Moving replica from node to node?

2013-07-08 Thread Otis Gospodnetic
Hi, Solr(Cloud) currently doesn't have any facility to move a specific replica from one node to the other. How come? Is there a technical or philosophical reason, or just the 24 hours/day reason? Thanks, Otis -- Solr ElasticSearch Support -- http://sematext.com/ Performance Monitoring --

Re: Moving replica from node to node?

2013-07-08 Thread Mark Miller
It's simply a sugar method that no one has gotten to yet. I almost have once or twice, but I always have moved onto other things before even starting. It's fairly simple to just start another replica on the TO node and then delete the replica on the FROM node, so not a lot of urgency. - Mark

Re: Solr 3.6 optimize and field cache question

2013-07-08 Thread Otis Gospodnetic
Hi, 70 GB heap and still OOMing? H sure, 14 fields for faceting, but still - 70 GB heap! Don't have source handy, but I quickly looked at FC src here - http://search-lucene.com/c/Lucene:core/src/java/org/apache/lucene/search/FieldCache.java - I see mentions of delete there, so I would

Re: Are the XML element names in schema.xml case sensitive?

2013-07-08 Thread Otis Gospodnetic
Woho, I love inconsistency in code! Not. :) Any idea why this is, Jack? Otis -- Solr ElasticSearch Support -- http://sematext.com/ Performance Monitoring -- http://sematext.com/spm On Mon, Jul 8, 2013 at 12:49 PM, Jack Krupansky j...@basetechnology.com wrote: Nope. -- Jack Krupansky

Re: Lucene pass through for faceting in SOLR

2013-07-08 Thread Otis Gospodnetic
I think nobody's itching enough, but I think it would be great to have facet.method=lucene :) Paste from http://search-lucene.com/m/NzVKPC5C8x1 from Yonik: Would it make sense to add Lucene's faceting as an *additional* Solr faceting method? Maybe? I don't really know though - I haven't

Re: Find related words

2013-07-08 Thread Otis Gospodnetic
Not sure... but if you need Collocations/SIPs, you can try the non-free-but-cheaper-than-DIY http://sematext.com/products/key-phrase-extractor/index.html Otis -- Solr ElasticSearch Support -- http://sematext.com/ Performance Monitoring -- http://sematext.com/spm On Sun, Jul 7, 2013 at 12:42

Re: Backup stops replication

2013-07-08 Thread Otis Gospodnetic
Hi, I don't recall any such changes around replication... Why not run the backup on the master? Otis -- Solr ElasticSearch Support -- http://sematext.com/ Performance Monitoring -- http://sematext.com/spm On Sat, Jul 6, 2013 at 3:05 AM, Cool Techi cooltec...@outlook.com wrote: Hi, We

Re: Solr large boolean filter

2013-07-08 Thread Otis Gospodnetic
Hi Roman, I referred to something I called server-side named filters. It matches the feature described at http://www.elasticsearch.org/blog/terms-filter-lookup/ Would be a cool addition, IMHO. Otis -- Solr ElasticSearch Support -- http://sematext.com/ Performance Monitoring --

Re: Solr large boolean filter

2013-07-08 Thread Roman Chyla
OK, thank you Otis, I *think* this should be easy to add - I can try. We were calling them 'private library' searches roman On Mon, Jul 8, 2013 at 11:58 PM, Otis Gospodnetic otis.gospodne...@gmail.com wrote: Hi Roman, I referred to something I called server-side named filters. It matches

Best way to call asynchronously - Custom data import handler

2013-07-08 Thread Learner
I wrote a custom data import handler to import data from files. I am trying to figure out a way to make asynchronous call instead of waiting for the data import response. Is there an easy way to invoke asynchronously (other than using futures and callables) ? public class