Re: SolrCloud leader

2013-07-13 Thread Radim Kolar
Since SolrCloud is a master-free architecture, you can send both queries 
and updates to ANY node and SolrCloud will assure that the data gets to 
where it belongs


It's way faster to send them to the right node.


Re: throttle segment merging

2012-10-29 Thread Radim Kolar

On 29.10.2012 12:18, Michael McCandless wrote:

With Lucene 4.0, FSDirectory now supports merge bytes/sec throttling
(FSDirectory.setMaxMergeWriteMBPerSec): it rate limits that max
bytes/sec load on the IO system due to merging.

Not sure if it's been exposed in Solr / ElasticSearch yet ...
It's not available in Solr. Also, Solr's class hierarchy for directory
factories is a bit different from Lucene's. In Solr, MMapDirectoryFactory and
NIOFSDirectoryFactory would need to be subclasses of StandardDirectoryFactory.
Then add a write limit property to StandardDirectoryFactory and it will be
inherited by the others, like in Lucene.


solr
http://lucene.apache.org/solr/4_0_0/solr-core/org/apache/solr/core/CachingDirectoryFactory.html
lucene
http://lucene.apache.org/core/4_0_0/core/org/apache/lucene/store/FSDirectory.html
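
To make the idea of bytes/sec throttling concrete, here is a rough sketch in Python. It is not Lucene's or Elasticsearch's actual code: the class name `MergeWriteThrottle` and its parameters are made up for illustration. After each chunk of merge output it computes how long the writes so far should have taken at the target rate, and sleeps for the difference.

```python
import time


class MergeWriteThrottle:
    """Hypothetical helper: caps the average write rate at max_mb_per_sec."""

    def __init__(self, max_mb_per_sec, clock=time.monotonic, sleep=time.sleep):
        self.bytes_per_sec = max_mb_per_sec * 1024 * 1024
        self.clock = clock      # injectable so the class can be tested
        self.sleep = sleep
        self.start = clock()
        self.bytes_written = 0

    def pause_if_needed(self, num_bytes):
        """Record num_bytes written; sleep if we are ahead of the target rate.

        Returns the number of seconds slept.
        """
        self.bytes_written += num_bytes
        # Time the writes so far should have taken at the target rate.
        target_elapsed = self.bytes_written / self.bytes_per_sec
        actual_elapsed = self.clock() - self.start
        delay = target_elapsed - actual_elapsed
        if delay > 0:
            self.sleep(delay)
            return delay
        return 0.0
```

A merge loop would call `pause_if_needed(len(chunk))` after writing each chunk, so indexing and searching still get a share of the IO.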


Re: throttle segment merging

2012-10-29 Thread Radim Kolar
Is there a JIRA ticket dedicated to throttling segment merges? I could not
find any, but JIRA search kinda sucks.


It should be ported from ES because it's not much code.


Re: throttle segment merging

2012-10-29 Thread Radim Kolar

On 29.10.2012 0:09, Lance Norskog wrote:

1) Do you use compound files (CFS)? This adds a lot of overhead to merging.

I do not know. What's the Solr configuration statement for turning them on/off?

2) Does ES use the same merge policy code as Solr?

ES rate limiting:

http://www.elasticsearch.org/guide/reference/index-modules/store.html
https://github.com/elasticsearch/elasticsearch/blob/master/src/main/java/org/elasticsearch/indices/store/IndicesStore.java
http://rajish.github.com/api/elasticsearch/0.20.0.Beta1-SNAPSHOT/org/apache/lucene/store/StoreRateLimiting.html




Re: throttle segment merging

2012-10-27 Thread Radim Kolar

On 26.10.2012 3:47, Tomás Fernández Löbbe wrote:

Is there a way to set up logging to output something when segment merging
runs?


I think segment merging is logged when you enable infoStream logging (you
should see it commented in the solrconfig.xml)
No, segment merging is not logged at info level. It needs a customized log
config.





Can segment merges be throttled?
> You can change when and how segments are merged with the merge 
policy, maybe it's enough for you changing the initial settings 
(mergeFactor for example)?


I am now researching Elasticsearch; it can do it, and it's Lucene 3.6 based.


throttle segment merging

2012-10-25 Thread Radim Kolar
I have problems with very low indexing speed as soon as the core size grows
over 15 GB. I suspect that it may be due to IO-intensive segment merging.


Is there a way to set up logging to output something when segment merging
runs?


Can segment merges be throttled?


one field type extending another

2012-10-25 Thread Radim Kolar
Can I do something like this? (It fails with fieldType: missing mandatory
attribute 'class')
 termVectors="true" termPositions="true" termOffsets="true"/>


One field type would extend another type to save copy and paste.


Re: solr 4.1 compression

2012-10-24 Thread Radim Kolar

I found this ticket: https://issues.apache.org/jira/browse/SOLR-3927
Is compression currently in the Lucene 4.1 branch only and not yet in the
Solr 4.1 branch?


solr 4.1 compression

2012-10-22 Thread Radim Kolar
Can someone provide an example configuration showing how to use the new
compression in Solr 4.1?


http://blog.jpountz.net/post/33247161884/efficient-compressed-stored-fields-with-lucene 



Re: What does _version_ field used for?

2012-10-18 Thread Radim Kolar
> You would only *need* the field if updateLog is turned on in your 
updateHandler.


Does SolrCloud not need this?


Re: add shard to index

2012-10-15 Thread Radim Kolar



Can you share more please?

I do not know exactly what the formula for calculating the ratio is.

If you have something like:

(term count in shard 1 + term count in shard 2) / num documents in all shards

then just use shard size as a weight while computing this:

(term count in shard 1 * shard1 keyspace size + term count in shard 2 * shard2 keyspace size) / (num documents in all shards * all shards keyspace size)
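
The two formulas above could be sketched in Python like this. It only restates the arithmetic from this message and makes no claim about how Solr actually scores; the function names and the `(term_count, keyspace_size)` pair layout are assumptions chosen for illustration.

```python
def plain_ratio(shards, total_docs):
    """Unweighted: sum of per-shard term counts / documents in all shards.

    shards is a list of (term_count, keyspace_size) pairs, one per shard.
    """
    return sum(count for count, _ in shards) / total_docs


def weighted_ratio(shards, total_docs):
    """Weighted: each shard's term count is multiplied by its keyspace size,
    and the denominator is multiplied by the keyspace size of all shards."""
    total_keyspace = sum(size for _, size in shards)
    weighted_terms = sum(count * size for count, size in shards)
    return weighted_terms / (total_docs * total_keyspace)
```

Note that with equal keyspace sizes the weighted value comes out as the unweighted one divided by the number of shards, so the two are on different absolute scales.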


Re: add shard to index

2012-10-12 Thread Radim Kolar

On 11.10.2012 1:12, Upayavira wrote:

That is what is being discussed already. The thing is, at present, Solr
requires an even distribution of documents across shards, so you can't
just add another shard, assign it to a hash range, and be done with it.

You can use shard size as part of the scoring mechanism.


Re: add shard to index

2012-10-08 Thread Radim Kolar
Do it the way it is done in the Cassandra database. Adding a new node and
redistributing data can be done in a live system without problems. It looks
like this:


Every Cassandra node has a key range assigned. Instead of assigning keys
to nodes like hash(key) mod nodes, every node has its own portion of the
hash keyspace. The portions do not need to be the same size; some nodes can
have a larger portion of the keyspace than others.


For example, say the hash function's max possible value is 12.

shard1 - 1-4
shard2 - 5-8
shard3 - 9-12

Now let's add a new shard. In Cassandra, adding a new shard by default cuts
an existing one in half, so you will have

shard1 - 1-2
shard2 - 3-4
shard3 - 5-8
shard4 - 9-12

See? You only needed to move documents from the old shard1. Usually you are
adding more than 1 shard during a reorganization, so you do not need to
rebalance the cluster that much by moving every node into a different
position in the hash keyspace.
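
The range scheme above can be sketched in a few lines of Python. This is only an illustration of the idea, not Cassandra's or Solr's code; the function names and the `(name, low, high)` tuple layout are assumptions. Unlike the listing above, the sketch gives the new shard its own name instead of renumbering the existing ones.

```python
def owner(ranges, hash_value):
    """Return the name of the shard whose inclusive range holds hash_value."""
    for name, low, high in ranges:
        if low <= hash_value <= high:
            return name
    raise ValueError("hash value outside keyspace: %r" % (hash_value,))


def split_shard(ranges, name, new_name):
    """Cut the named shard's range in half; the upper half goes to new_name.

    Returns (new_ranges, moved), where moved is the (low, high) range of
    hash values whose documents have to be moved to the new shard.
    """
    result = []
    moved = None
    for shard, low, high in ranges:
        if shard == name:
            if low == high:
                raise ValueError("range of %s is too small to split" % name)
            mid = (low + high) // 2
            result.append((shard, low, mid))
            result.append((new_name, mid + 1, high))
            moved = (mid + 1, high)
        else:
            # Every other shard keeps its range, so none of its data moves.
            result.append((shard, low, high))
    if moved is None:
        raise KeyError(name)
    return result, moved


ranges = [("shard1", 1, 4), ("shard2", 5, 8), ("shard3", 9, 12)]
new_ranges, moved = split_shard(ranges, "shard1", "shard1b")
```

Only hash values in `moved` change owner; a key that hashed to shard2 or shard3 before the split still lands there afterwards.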


add shard to index

2012-10-07 Thread Radim Kolar
I am reading this: http://wiki.apache.org/solr/SolrCloud section
Re-sizing a Cluster


Is it possible to add a shard to an existing index? I do not need the
data redistributed; it can stay where it is. It's enough for me if new
entries are distributed across the new number of shards. Restarting
Solr is fine.


Re: solr binary protocol

2012-10-03 Thread Radim Kolar



Have you looked at javabin?
http://wiki.apache.org/solr/javabin

I checked that page:
"binary format used to write out solr's response"

So it's for the Solr response only, not for requests sent to Solr.


solr binary protocol

2012-09-26 Thread Radim Kolar
Is it possible to use the Solr binary protocol instead of XML for talking TO
Solr? I know that it can be used in Solr replies.


Re: solrcloud without realtime

2012-09-24 Thread Radim Kolar

On 24.9.2012 14:05, Erick Erickson wrote:

I'm pretty sure all you need to do is disable autoSoftCommit. Or rather
don't un-comment it from solrconfig.xml
And what about solr.NRTCachingDirectoryFactory? Is
solr.MMapDirectoryFactory faster if there are no NRT search requirements?


solrcloud without realtime

2012-09-24 Thread Radim Kolar
Is it possible to use SolrCloud without the real-time features? In my
application I do not need real-time features, and old-style processing
should be more efficient.


Re: solr4.0 compile: Unable to create javax script engine for javascript

2012-09-22 Thread Radim Kolar



I believe the problem is not that you need BSF -- because you are using
java6 (you have to to compile Solr) so BSF isn't needed.  The problem (i
think) is that FreeBSD's JDK doesn't include javascript by default - i
believe you just need to install the rhino "js.jar"

You are right. FreeBSD's OpenJDK does not include JavaScript.

I finally got it to work; I commented at
https://issues.apache.org/jira/browse/LUCENE-4415


solr4.0 compile: Unable to create javax script engine for javascript

2012-09-21 Thread Radim Kolar

I can't compile Solr 4.0, but I can compile trunk fine.

ant create-package fails with:

BUILD FAILED
/usr/local/jboss/.jenkins/jobs/Solr/workspace/solr/common-build.xml:229: The following error occurred while executing this line:
/usr/local/jboss/.jenkins/jobs/Solr/workspace/solr/common-build.xml:279: Unable to create javax script engine for javascript


Total time: 1 minute 19 seconds


I tried Google and found this:
http://stackoverflow.com/questions/9663801/freebsd-ant-javax-script-engine


According to the Ant documentation
(https://ant.apache.org/manual/Tasks/script.html), it requires some
external library. I placed bsf-all-3.1.jar into the Ant lib directory, but
it still fails.


Can you please update the document
http://wiki.apache.org/solr/HowToCompileSolr?highlight=%28compile%29
with information on which libraries are needed?


Re: [Solr4 beta] error 503 on commit

2012-09-12 Thread Radim Kolar



Couldn't Solr close the oldest warming searcher and replace it with a
new one?

That approach can easily lead to starvation (i.e. you never get a new
searcher usable for queries).

It will not, if there is more than 1 warming searcher. Look at this scheme:

1. current in use searcher
2. 1st warming searcher
3. 2nd warming searcher

If a new warming searcher is needed, close (3) and create a new one in its place (3).
(2) will finish its work uninterrupted and will replace (1).
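
The scheme above can be modelled in a few lines of Python. This is a sketch of the proposal only, not how Solr behaves (per this thread, Solr returns an error once maxWarmingSearchers is exceeded); the class `SearcherSlots` and its method names are hypothetical.

```python
class SearcherSlots:
    """Hypothetical model: one searcher in use plus up to two warming ones."""

    def __init__(self, active, max_warming=2):
        self.active = active            # (1) searcher currently in use
        self.warming = []               # (2), (3): warming, oldest first
        self.max_warming = max_warming

    def open_new(self, searcher):
        """Add a warming searcher; returns the searcher closed to make room."""
        closed = None
        if len(self.warming) >= self.max_warming:
            # Close the newest warming searcher (3), never the oldest (2),
            # so the oldest one always gets to finish warming.
            closed = self.warming.pop()
        self.warming.append(searcher)
        return closed

    def finish_oldest(self):
        """The oldest warming searcher (2) is ready and replaces (1).

        Returns the searcher that was in use until now.
        """
        if not self.warming:
            raise RuntimeError("no warming searcher to promote")
        previous = self.active
        self.active = self.warming.pop(0)
        return previous
```

Because only slot (3) is ever recycled, a steady stream of commits cannot stop slot (2) from finishing, which is the starvation argument made above.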


Re: [Solr4 beta] error 503 on commit

2012-09-12 Thread Radim Kolar
> After investigating more, here is the Tomcat log below. It is
indeed the same problem: "exceeded limit of maxWarmingSearchers=2".


Couldn't Solr close the oldest warming searcher and replace it
with a new one?


Re: cant build trunk

2012-09-10 Thread Radim Kolar
Anyway, I've removed the empty @return tags in the files mentioned in 
the below report, in trunk r1382976.


It builds fine now.


Re: cant build trunk

2012-09-10 Thread Radim Kolar

On 10.9.2012 18:28, Jack Krupansky wrote:

How are you compiling trunk?

Jenkins - ant task - arguments:
clean compile dist create-package package-local-src-tgz


cant build trunk

2012-09-10 Thread Radim Kolar

Trunk does not compile due to javadoc warnings. Should I create a ticket and
submit a patch, or are these trivial errors fixed quickly?

  [javadoc] Building tree for all the packages and classes...
  [javadoc] /usr/local/jboss/.jenkins/jobs/Solr/workspace/solr/core/src/java/org/apache/solr/cloud/LeaderElector.java:168: warning - @return tag has no arguments.
  [javadoc] /usr/local/jboss/.jenkins/jobs/Solr/workspace/solr/core/src/java/org/apache/solr/cloud/LeaderElector.java:198: warning - @return tag has no arguments.
  [javadoc] /usr/local/jboss/.jenkins/jobs/Solr/workspace/solr/core/src/java/org/apache/solr/cloud/ZkController.java:667: warning - @return tag has no arguments.
  [javadoc] /usr/local/jboss/.jenkins/jobs/Solr/workspace/solr/core/src/java/org/apache/solr/core/CachingDirectoryFactory.java:255: warning - @return tag has no arguments.
  [javadoc] /usr/local/jboss/.jenkins/jobs/Solr/workspace/solr/core/src/java/org/apache/solr/spelling/PossibilityIterator.java:183: warning - @return tag has no arguments.
  [javadoc] Generating /usr/local/jboss/.jenkins/jobs/Solr/workspace/solr/build/docs/api/org/apache/solr/uima/processor/exception/FieldMappingException.html...
  [javadoc] Copying file /usr/local/jboss/.jenkins/jobs/Solr/workspace/solr/core/src/java/doc-files/tutorial.html to directory /usr/local/jboss/.jenkins/jobs/Solr/workspace/solr/build/docs/api/doc-files...
  [javadoc] Generating /usr/local/jboss/.jenkins/jobs/Solr/workspace/solr/build/docs/api/org/apache/solr/util/package-summary.html...
  [javadoc] Copying file /usr/local/jboss/.jenkins/jobs/Solr/workspace/solr/core/src/java/org/apache/solr/util/doc-files/min-should-match.html to directory /usr/local/jboss/.jenkins/jobs/Solr/workspace/solr/build/docs/api/org/apache/solr/util/doc-files...
  [javadoc] Building index for all the packages and classes...
  [javadoc] Building index for all classes...
  [javadoc] Generating /usr/local/jboss/.jenkins/jobs/Solr/workspace/solr/build/docs/api/help-doc.html...
  [javadoc] 5 warnings

BUILD FAILED
/usr/local/jboss/.jenkins/jobs/Solr/workspace/solr/build.xml:527: The following error occurred while executing this line:
/usr/local/jboss/.jenkins/jobs/Solr/workspace/solr/common-build.xml:223: The following error occurred while executing this line:
/usr/local/jboss/.jenkins/jobs/Solr/workspace/lucene/common-build.xml:1552: Javadocs warnings were found!



solr trademark

2012-07-25 Thread Radim Kolar

Mark,

You are certainly not using the Solr mark in an approved manner and I'd hope if 
you are going to take advantage of our mailing list for promotion of your 
product, that you would not violate our trademark.
The Apache Foundation does not own the SOLR (R) trademark. I looked into the
registries (USA and world) and neither of them shows the Apache Foundation as
owner of the SOLR trademark.


Trademarks registered to the Apache Foundation in the USA are:
HADOOP, OPENOFFICE.ORG, SUBVERSION

That's all. In the world registry, you have nothing at all. The rest of the
trademarks from http://apache.org/foundation/marks/list/ are
home-invented ones. It's almost impossible to win in court if you do not
have your trademark registered.


Even "Apache" is not your trademark anymore. I guess you guys failed to 
renew.

Word Mark: APACHE
Serial Number: 85100058
Abandonment Date: May 23, 2011
Owner: The Apache Software Foundation CORPORATION DELAWARE 1901
Munsey Drive Forest Hill MARYLAND 210502747


> Solr is our trademark and third party products must have their own name.
The Solr trademark is owned and properly registered by somebody else. It's
YOU who is violating the registered SOLR trademark.


Re: [Announce] Solr 3.6 with RankingAlgorithm 1.4.2 - NRT support

2012-05-27 Thread Radim Kolar




What reference page are you referring to?

http://tgels.com/wiki/en/Sites_using/downloaded_RankingAlgorithm_or_Solr-RA



Re: [Announce] Solr 3.6 with RankingAlgorithm 1.4.2 - NRT support

2012-05-27 Thread Radim Kolar
My company is thinking of buying a search algorithm from Petr Hejl, a famous
expert in searching - http://www.milionovastranka.net/
but I see RankingAlgorithm has fantastic results too and, looking at its
reference page, it even powers sites like oracle.com and ebay.com.


Re: Solritas in production

2012-05-06 Thread Radim Kolar

On 6.5.2012 14:02, András Bártházi wrote:

We're currently evaluating Solr as a Sphinx replacement. The
development is almost done, and it seems to be working fine

Why do you want to replace Sphinx with Solr?


Re: Benchmark Solr vs Elastic Search vs Sensei

2012-04-27 Thread Radim Kolar

On 27.4.2012 19:59, Jeremy Taylor wrote:

DataStax offers a Solr integration that isn't master/slave and is
NearRealTimes.

Is it rebranded Solandra?


custom org.apache.lucene.store.Directory

2012-04-14 Thread Radim Kolar
Is a custom org.apache.lucene.store.Directory supported in Solr? I
want to try Infinispan.
