Re: solr java.lang.NullPointerException on select queries

2012-06-19 Thread avenka
For the first install, I copied over all files in the directory "example" into, let's call it, "install1". I did the same for "install2". The two installs run on different ports, use different jar files, are not really related to each other in any way as far as I can see. In particular, they are no

Re: Indexation Speed?

2012-06-19 Thread Lance Norskog
M. Della Bitta is right: we're not talking about post.jar, but about starting Solr: java -Xmx300m -jar start.jar On Tue, Jun 19, 2012 at 10:05 AM, Erick Erickson wrote: > Well, it _used_ to be defaulted in the code, but on looking at 3.6 it seems > like it defaults to Integer.MAX_VALUE, so you're fi

Re: question about DIH

2012-06-19 Thread alex.wang
It still does not work in delta-import mode, and the result is as follows: 0 15 data-config.xml delta-import idle 0 0 0 2012-06-20 10:48:16 2012-06-20 10:48:16 2012-06-20 10:48:17 2012-06-20 10:48:17 0 0 0:0:0.62 This response format is experimental. It is likely to change in the future. -

Re: How is it possible?

2012-06-19 Thread Erik Hatcher
You'll have to reindex. It's the only way out of this situation, sorry. Erik On Jun 19, 2012, at 18:05, Bruno Mannina wrote: > On 20/06/2012 00:26, Rafał Kuć wrote: >> Hello! >> >> Try using string type instead of text_general. That should help. >> > it's too late now no? > > I index

Re: How is it possible?

2012-06-19 Thread Bruno Mannina
On 20/06/2012 00:26, Rafał Kuć wrote: > Hello! > > Try using string type instead of text_general. That should help. It's too late now, no? I indexed around 18 000 000 docs. Do you think it could be the problem? (text_general instead of string)

Re: How is it possible?

2012-06-19 Thread Rafał Kuć
Hello! Try using string type instead of text_general. That should help. -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch > Hi All, > In my Schema.xml I have: > stored="true" required="true"/> > > > patent-number > > And when I

Re: Solr v3.5.0 - numFound changes when paging through results on 8-shard cluster

2012-06-19 Thread Justin Babuscio
I believe that is the issue. We recently lost a physical server and a misinformed (due to weekend fire) sys admin moved one of the master shards. This caused the automated deployment scripts to change the order of publishing. When a rebuild followed the following day, we essentially wrote the sa

How is it possible?

2012-06-19 Thread Bruno Mannina
Hi All, In my Schema.xml I have: stored="true" required="true"/> patent-number And when I run this query on my Admin page: patent-number:BE858458A1 I get two identical documents?! Maybe I ran java -jar post.jar BE.xml twice, but as Patent-Number is a uniqueKey the second indexat

Re: Solr v3.5.0 - numFound changes when paging through results on 8-shard cluster

2012-06-19 Thread Chris Hostetter
: Confirming that there are no active records being written, the "numFound" : value is decreasing as we page through the results. 1) check that the "clones" of each shard are in fact identical (just look at the index files on each machine and make sure they are the same). 2) distributed searching

highlighting field boundary detection

2012-06-19 Thread Mike Sokolov
Does anybody know of a way to detect when the highlight snippet begins at the beginning of the field or ends at the end of the field using one of the standard highlighters shipped w/Solr? We'd like to display ellipses only when there is additional text surrounding the snippet in the original
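One possible client-side workaround, offered as a rough sketch rather than a feature of the standard highlighters: strip the highlight tags from the snippet and compare it against the stored field value. The helper name `add_ellipses`, the `<em>` tag default, and the assumption that the untagged snippet is a verbatim substring of the stored field are all illustrative assumptions, not something stated in the thread.

```python
import re


def add_ellipses(field_text, snippet, tag="em"):
    """Wrap a highlight snippet in ellipses only where the field has more text.

    Hypothetical helper: assumes the snippet, once its highlight tags are
    stripped, appears verbatim in the stored field value.
    """
    # Remove opening and closing highlight tags, e.g. <em> and </em>.
    plain = re.sub(r"</?%s>" % re.escape(tag), "", snippet)
    start = field_text.find(plain)
    if start == -1:
        # Snippet could not be located; return it unchanged rather than guess.
        return snippet
    end = start + len(plain)
    prefix = "..." if start > 0 else ""
    suffix = "..." if end < len(field_text) else ""
    return prefix + snippet + suffix
```

Note that `find` returns the first match, so a field that repeats the same text could yield the wrong decision; a production version would need the snippet offsets instead.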

Re: Solr v3.5.0 - numFound changes when paging through results on 8-shard cluster

2012-06-19 Thread Justin Babuscio
As I understand your problem, it sounds like you were using your master as part of your search cluster so the two distributed queries were returning conflicting numbers. In my scenario, our eight Masters are used for /updates & /deletes only. There are no queries issued to these nodes. When the

Re: Solr v3.5.0 - numFound changes when paging through results on 8-shard cluster

2012-06-19 Thread Shawn Heisey
On 6/19/2012 2:32 PM, Justin Babuscio wrote: 2) For the shards, we use the URL parameters, shards=s1/solr,s2/solr,s3/solr,...,s8/solr where s# point to a baremetal load balancer that routes the requests to one of the two slave shards. This most likely has nothing to do with your question a

Re: Solr v3.5.0 - numFound changes when paging through results on 8-shard cluster

2012-06-19 Thread Justin Babuscio
1) We have 1 core and we use the default search handler. 2) For the shards, we use the URL parameters, shards=s1/solr,s2/solr,s3/solr,...,s8/solr where s# point to a baremetal load balancer that routes the requests to one of the two slave shards. There is definitely the chance that on each

Re: Solr v3.5.0 - numFound changes when paging through results on 8-shard cluster

2012-06-19 Thread Yury Kats
On 6/19/2012 4:06 PM, Justin Babuscio wrote: > Solr v3.5.0 > 8 Master Shards > 2 Slaves Per Master > > Confirming that there are no active records being written, the "numFound" > value is decreasing as we page through the results. > > For example, > Page1 - numFound = 3683 > Page2 - numFound = 36

Re: How to change data subdirectory in Solr

2012-06-19 Thread Erick Erickson
Frankly, that seems like a heck of a lot of work relative to moving the indexes. _why_ do your bosses insist on this? And have you made it explicit to them that 1> this will cost considerable time to develop that you could spend putting in _useful_ features. 2> you'll have to maintain this cod

Solr v3.5.0 - numFound changes when paging through results on 8-shard cluster

2012-06-19 Thread Justin Babuscio
Solr v3.5.0 8 Master Shards 2 Slaves Per Master Confirming that there are no active records being written, the "numFound" value is decreasing as we page through the results. For example, Page1 - numFound = 3683 Page2 - numFound = 3683 Page3 - numFound = 3683 Page4 - numFound = 2866 Page5 - numFou
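The symptom described here can be checked mechanically. A minimal sketch, assuming the numFound value of each page has already been collected into a list (how the pages are fetched is left out, and `first_numfound_change` is a hypothetical helper name):

```python
def first_numfound_change(num_found_per_page):
    """Return (page_number, previous, current) for the first page whose
    numFound differs from the page before it, or None if all pages agree.
    Page numbers are 1-based."""
    for i in range(1, len(num_found_per_page)):
        prev, cur = num_found_per_page[i - 1], num_found_per_page[i]
        if cur != prev:
            return (i + 1, prev, cur)
    return None


# Values for pages 1-4 as reported in the message above.
print(first_numfound_change([3683, 3683, 3683, 2866]))  # (4, 3683, 2866)
```

Running such a check against each slave directly, rather than through the load balancer, might help narrow down which copy of a shard disagrees.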

Solr with Tomcat on VPS

2012-06-19 Thread Hill Michael (NHQ-AC)
I am running Solr in a shared Tomcat v5.5.28 (I have access to all instances) on a Linux VPS server. When I set it all up, Tomcat starts properly and I can see that it has accessed my Solr config directory properly. I can access the JSP pages if I reference them directly (http://mysite.com/solr/

Re: Indexation Speed?

2012-06-19 Thread Erick Erickson
Well, it _used_ to be defaulted in the code, but on looking at 3.6 it seems like it defaults to Integer.MAX_VALUE, so you're fine. And it's all deprecated in 4.x and will be gone. Best Erick On Tue, Jun 19, 2012 at 7:07 AM, Bruno Mannina wrote: > Actually -Xmx512m and no effect > > Concerning

Re: How to change data subdirectory in Solr

2012-06-19 Thread Vitor M. Barbosa
My bosses are insisting on not changing our current Lucene folders. So I have to get back on this, can I overwrite Solr's CoreAdminHandler (or other classes) to change the way it'll look for the index directory? -- View this message in context: http://lucene.472066.n3.nabble.com/How-to-change-dat

Re: Indexation Speed?

2012-06-19 Thread Otis Gospodnetic
Bruno, Look at SPM for Solr ( http://sematext.com/spm ) - very handy for understanding disk IO vs. CPU vs. JVM GC, etc. during indexing/performance testing. You could also play with ramBufferSizeMB in solrconfig.xml Otis  Performance Monitoring for Solr / ElasticSearch / HBase - http://se

Re: search for alphabetic version of numbers

2012-06-19 Thread Chris Hostetter
: I have the requirement to support searching for numbers with their : alphabetic or by digits. : For example, if we have a document with a field's value of '200', : if we search for "two hundred", that document should match. : : I haven't found anything like this yet. Do we have other option tha
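One index-time approach, sketched here as an assumption rather than anything built into Solr, is to spell out numeric values before indexing and store the result in an additional searchable field, so that a query for "two hundred" can match a document whose value is 200. The function `number_to_words` is a hypothetical helper and only covers 0 to 999.

```python
ONES = [
    "zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
    "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
    "sixteen", "seventeen", "eighteen", "nineteen",
]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def number_to_words(n):
    """Spell out an integer in English (this sketch handles 0..999 only)."""
    if not 0 <= n <= 999:
        raise ValueError("only 0..999 is supported in this sketch")
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] + ("-" + ONES[ones] if ones else "")
    hundreds, rest = divmod(n, 100)
    words = ONES[hundreds] + " hundred"
    return words + (" " + number_to_words(rest) if rest else "")


print(number_to_words(200))  # two hundred
```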

Re: Indexation Speed?

2012-06-19 Thread Michael Della Bitta
Just an observation... The OP is tweaking the heap size on post.jar, but wouldn't some tuning of the container that Solr is running in also be in order? Michael Della Bitta Appinions, Inc. -- Where Influence Isn’t a Game. http://www.appinions.com

LanguageDetection inside of ExtractingRequestHandler

2012-06-19 Thread Martin Ruckli
Hi all, I just wanted to check if there is a demand for this feature. I had to implement this functionality for one of our customers and would like to contribute it. Here is the use case: We are using the ExtractingRequestHandler with the extractOnly=true flag set. With a request to this handle

Re: Multicore master-slaver replication in Solr Cloud

2012-06-19 Thread Mark Miller
On Jun 19, 2012, at 9:59 AM, fabio curti wrote: > Hi, > i tried to set a Multicore master-slaver replication in Solr Cloud found in > this post > http://pulkitsinghal.blogspot.it/2011/09/multicore-master-slave-replication-in.html > but > i get the following problem > > SEVERE: Error while trying

Re: Solr 1.4, slaves hang after replication from an just optimized master

2012-06-19 Thread Vadim Kisselmann
Forgot to mention: After a Tomcat restart, the slaves still have an index of 300GB. After a manual replication command in the UI, it drops to 100GB like the master within a couple of seconds and all is OK. 2012/6/19 Vadim Kisselmann : > Hi folks, > > I have to look after an old live system with Solr 1.4. > When I optimi

Solr 1.4, slaves hang after replication from an just optimized master

2012-06-19 Thread Vadim Kisselmann
Hi folks, I have to look after an old live system with Solr 1.4. When I optimize a bigger index of roughly 200GB (100GB after optimize and cut) and my slaves replicate the newest version after(!) the optimize, they all hang at 100% in replication and suddenly have index sizes of circa 300GB.

Multicore master-slaver replication in Solr Cloud

2012-06-19 Thread fabio curti
Hi, i tried to set a Multicore master-slaver replication in Solr Cloud found in this post http://pulkitsinghal.blogspot.it/2011/09/multicore-master-slave-replication-in.html but i get the following problem SEVERE: Error while trying to recover. org.apache.solr.client.solrj.SolrServerException: Ser

Re: Solr spellchecking fails on sharded query

2012-06-19 Thread fabio curti
Hi, i found this article about your issue. http://wiki.apache.org/solr/SpellCheckComponent#Distributed_Search_Support Fabio 2012/6/19 Eric Wilson > I have a Solr application that is distributed into 11 shards, using Solr > version 4.0.0.2011.07.26.16.34.16 > > In the solrconfig.xml for each sh

Solr spellchecking fails on sharded query

2012-06-19 Thread Eric Wilson
I have a Solr application that is distributed into 11 shards, using Solr version 4.0.0.2011.07.26.16.34.16 In the solrconfig.xml for each shard, I have configured a spellcheck component: textSpell cn_spell company_name_spell 0.0001 true

Re: Indexation Speed?

2012-06-19 Thread François Schiettecatte
There is a lot of good information about that on the web, just google for 'ubuntu performance monitor' Also the ubuntu website has a pretty good help section: https://help.ubuntu.com/ and a community wiki: https://help.ubuntu.com/community Cheers François On Jun 19, 2012, at

No spellcheck if not enabled in default request handler

2012-06-19 Thread Markus Jelsma
Hi. Just now i deactivated the spellcheck component in the default request handler by commenting out the last-components arr. Our other search request handler has spellcheck enabled and uses distributed search. After deactivating the component in the default handler the queries never yield sugg

Re: Indexation Speed?

2012-06-19 Thread Bruno Mannina
Linux Ubuntu :) for 2 months now! So I'm new to this world :) On 19/06/2012 15:01, François Schiettecatte wrote: Well that depends on the platform you are on, you did not mention that. If you are using linux, you could use atop ( http://www.atoptool.nl/ ), or top, or iostat or stat, or al

Re: Indexation Speed?

2012-06-19 Thread François Schiettecatte
Well that depends on the platform you are on, you did not mention that. If you are using linux, you could use atop ( http://www.atoptool.nl/ ), or top, or iostat or stat, or all four. Cheers François On Jun 19, 2012, at 8:55 AM, Bruno Mannina wrote: > CPU is not used, just 50-60% sometimes d

Re: Indexation Speed?

2012-06-19 Thread Bruno Mannina
The CPU is not fully used, just 50-60% at times during the process, but how can I check hard disk I/O? On 19/06/2012 14:13, François Schiettecatte wrote: Just a suggestion, you might want to monitor CPU usage and disk I/O, there might be a bottleneck. Cheers François On Jun 19, 2012, at 7:07 AM, Bruno M

Re: IndexWrite in Lucene/Solr 3.5 is slower?

2012-06-19 Thread Torsten Krah
May be related to https://issues.apache.org/jira/browse/LUCENE-3418 which does ensure things are really written; if you commit very often, you may see this sort of performance loss (at least I did in my JUnit tests, where I commit very often, and the switch from 3.3 to 3.4 really hurts here at test time

Re: What we loose if we use ClassicTokenizer instead of StandardTokenizer

2012-06-19 Thread Alok Bhandari
Thanks for the reply. Yes, I had started using the admin/analysis page before you suggested it, but I just wanted to know if anything specific is supported or not supported out of the box by the tokenizers specified. -- View this message in context: http://lucene.472066.n3.nabble.com/What-we-loose-if-we-use-C

Re: Indexation Speed?

2012-06-19 Thread François Schiettecatte
Just a suggestion, you might want to monitor CPU usage and disk I/O, there might be a bottleneck. Cheers François On Jun 19, 2012, at 7:07 AM, Bruno Mannina wrote: > Actually -Xmx512m and no effect > > Concerning maxFieldLength, no problem it's commented > > On 19/06/2012 13:02, Erick Erick

Re: Indexation Speed?

2012-06-19 Thread Bruno Mannina
Actually -Xmx512m, and no effect. Concerning maxFieldLength, no problem, it's commented out. On 19/06/2012 13:02, Erick Erickson wrote: Then try -Xmx600M next try -Xmx900M etc. The idea is to bump things on separate runs. But be a little cautious here. Look in your solrconfig.xml file, you'll se

Re: question about DIH

2012-06-19 Thread Erick Erickson
Solr indexes all times as UTC (that's what the "Z" is all about). You ought to be able to get your SQL to return UTC time rather than local, and that should fix things up for you. Best Erick On Tue, Jun 19, 2012 at 6:24 AM, alex.wang wrote: > hi all: > when i import the data from db to solr.
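To illustrate the UTC point, here is a minimal sketch of converting a local timestamp into a UTC "Z" form before it is handed to Solr. The day/month/year input format and the UTC+8 offset are assumptions taken from the values and the "8 hours" mentioned in the original question; `to_solr_utc` is a hypothetical helper, not part of DIH.

```python
from datetime import datetime, timedelta, timezone


def to_solr_utc(local_str, utc_offset_hours):
    """Convert 'dd/mm/YYYY HH:MM:SS' local time to a UTC string ending in Z.

    utc_offset_hours is the local zone's fixed offset from UTC (assumed here;
    a zone with daylight saving would need a real tz database instead).
    """
    local_tz = timezone(timedelta(hours=utc_offset_hours))
    local_dt = datetime.strptime(local_str, "%d/%m/%Y %H:%M:%S")
    local_dt = local_dt.replace(tzinfo=local_tz)
    return local_dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


print(to_solr_utc("16/02/2012 12:05:16", 8))  # 2012-02-16T04:05:16Z
```

The same 8-hour shift has to be applied consistently to any timestamp the delta query compares against, otherwise the full import and the delta import will disagree about which rows are new.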

Re: Indexation Speed?

2012-06-19 Thread Erick Erickson
Then try -Xmx600M next try -Xmx900M etc. The idea is to bump things on separate runs. But be a little cautious here. Look in your solrconfig.xml file, you'll see a commented-out line 1 The default behavior for Solr/Lucene is to index the first 10,000 tokens (not characters, think of tokens

Re: What we loose if we use ClassicTokenizer instead of StandardTokenizer

2012-06-19 Thread Erick Erickson
You're asking us to predict the future, which if I could I'd be rich enough to build a mansion. It's not marked as deprecated in 4.x or trunk, so it doesn't look like there are any plans to deprecate it. Although what the future holds is a good question. I'd _strongly_ advise that you look at th

question about DIH

2012-06-19 Thread alex.wang
Hi all: when I import data from the DB into Solr, Solr changes the value according to the timezone, e.g. the original value 16/02/2012 12:05:16 is changed to 1/02/2012 04:05:06. When I add 8 hours in my SQL, it is correct. But when I use delta-import mode to add to the index, it does not work. the

Re: delete by query don't work

2012-06-19 Thread vidhya
In order to clear all the indexed data please try to use this code private void Btn_Delete_Click(object sender, EventArgs e) { var solrUrl = this.textBoxSolrUrl.Text; indexer.FixtureSetup(solrUrl); indexer.Delete(); MessageBox.Sho

Re: indexing a xml file

2012-06-19 Thread vidhya
FATAL: Solr returned an error #400 ERROR:unknown field >'name' This issue is due to a data type mismatch between Solr (schema.xml) and the coding part (adding documents). Make sure the fields match in both. -- View this message in context: http://lucene.472066.n3.nabble.com/indexing-a-xml

Re: Indexation Speed?

2012-06-19 Thread Bruno Mannina
Like that? java -Xmx300m -jar post.jar myfile.xml On 19/06/2012 11:11, Lance Norskog wrote: Ah! Java memory size is a java command line option: http://javahowto.blogspot.com/2006/06/6-common-errors-in-setting-java-heap.html You would try increasing the memory size in stages up to maybe 30

Re: Indexation Speed?

2012-06-19 Thread Lance Norskog
Ah! Java memory size is a java command line option: http://javahowto.blogspot.com/2006/06/6-common-errors-in-setting-java-heap.html You would try increasing the memory size in stages up to maybe 300m. On Tue, Jun 19, 2012 at 2:04 AM, Bruno Mannina wrote: > > > On 19/06/2012 10:51, Lance Norskog

What we loose if we use ClassicTokenizer instead of StandardTokenizer

2012-06-19 Thread Alok Bhandari
Hello, I need to know what I will lose if I use ClassicTokenizer instead of StandardTokenizer. Is it the case that ClassicTokenizer will be deprecated in future Solr versions, or that development of ClassicTokenizer is going to halt? Please let me know. -- View this message in c

Re: Indexation Speed?

2012-06-19 Thread Bruno Mannina
On 19/06/2012 10:51, Lance Norskog wrote: 675 doc/s is respectable for that server. You might move the memory allocated to Java up and down: there is a balance between amount of memory in Java vs. the OS disk buffer. How can I do that? Is there an option on my command line or in a c

Re: StreamingUpdateSolrServer - Failure during indexing

2012-06-19 Thread Lance Norskog
When one document fails, the entire update fails, right? Is there now a mode where successful documents are added and failed docs are dropped? If you want to know if a document is in the index, search for it! There is no other guaranteed way. On Sun, Jun 17, 2012 at 3:14 PM, Jack Krupansky wrote

Re: Indexation Speed?

2012-06-19 Thread Lance Norskog
675 doc/s is respectable for that server. You might move the memory allocated to Java up and down- there is a balance between amount of memory in Java v.s. the OS disk buffer. And, of course, use the latest trunk. On Tue, Jun 19, 2012 at 12:10 AM, Bruno Mannina wrote: > Correction: file size is

Re: Indexation Speed?

2012-06-19 Thread Bruno Mannina
Correction: the file size is 40 MB!!! On 19/06/2012 09:09, Bruno Mannina wrote: Dear All, I would like to know if the indexation speed is right. I have a 40 GB file with around 27 000 docs inside. I index around 20 fields, My (old) test server is a DualCore 3.06GHz Intel Xeon with only 1G

Indexation Speed?

2012-06-19 Thread Bruno Mannina
Dear All, I would like to know if the indexation speed is right. I have a 40 GB file with around 27 000 docs inside. I index around 20 fields, My (old) test server is a DualCore 3.06GHz Intel Xeon with only 1 GB RAM The file takes 40 seconds with the command line: java -jar post.jar myfile.
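For reference, the throughput implied by these figures can be worked out directly. A small sketch using only the numbers given in the message (27 000 docs, 40 seconds); `docs_per_second` is a hypothetical helper name:

```python
def docs_per_second(doc_count, elapsed_seconds):
    """Average indexing throughput over one run."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return doc_count / elapsed_seconds


print(docs_per_second(27000, 40))  # 675.0
```

That is the 675 docs/s figure quoted in the replies; timing several runs and averaging would give a steadier number than a single 40-second sample.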