Thanks Jason! That was very helpful.
I read on the solr wiki that:
"Documents must have a unique key and the unique key must be stored
(stored="true" in schema.xml)"
What is this unique key? Is this just an id that we define in the schema.xml
that is unique to all documents? We have something as f
Saqib:
At the simplest level:
1) Source the machine
2) Install Java
3) Install a servlet container of your choice
4) Copy your Solr WAR and conf directories as desired (probably a rough mirror
of your current single server)
5) Start it up and start sending data there
6) Query both by simpl
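By way of illustration, a minimal SolrJ sketch of step 6, using the standard
shards parameter to fan the query out to both servers (hostnames and core
names are hypothetical):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class DistributedQuery {
    public static void main(String[] args) throws Exception {
        // Send the query to either node; shards= lists every server to search
        HttpSolrServer server = new HttpSolrServer("http://server1:8983/solr/core1");
        SolrQuery q = new SolrQuery("*:*");
        q.set("shards", "server1:8983/solr/core1,server2:8983/solr/core1");
        QueryResponse rsp = server.query(q);
        System.out.println("Hits across both servers: " + rsp.getResults().getNumFound());
    }
}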
Hello Otis,
I was thinking more in terms of Solr DistributedSearch rather than
SolrCloud. I was hoping to add another Solr instance when the time comes.
This is a low-use application, but with a lot of data. Uptime and query speed
are not of importance. However, we would like to be able to index mor
Hi,
It's a broad question, but it starts with getting a few servers,
putting Solr 4.3.1 on them (soon 4.4), setting up ZooKeeper, creating a
Solr Collection (index) with N shards and M replicas, and reindexing
your old data to this new cluster, which you can expand with new nodes
over time. If you
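As a rough sketch of the collection-creation step, the Collections API can be
driven from SolrJ like this (host, collection name, and shard/replica counts
are illustrative):

import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.request.QueryRequest;
import org.apache.solr.common.params.ModifiableSolrParams;

public class CreateCollection {
    public static void main(String[] args) throws Exception {
        HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr");
        ModifiableSolrParams params = new ModifiableSolrParams();
        params.set("action", "CREATE");
        params.set("name", "mycollection");    // hypothetical collection name
        params.set("numShards", "3");          // N shards
        params.set("replicationFactor", "2");  // M replicas
        QueryRequest req = new QueryRequest(params);
        req.setPath("/admin/collections");     // route to the Collections API handler
        server.request(req);
    }
}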
A question regarding the 2.1 billion+ document limit.
I understand that a single instance of Solr has a limit of 2.1 billion
documents (Lucene document IDs are Java ints, so the hard cap per index is
Integer.MAX_VALUE, roughly 2.1 billion).
We currently have a single Solr server. If we reach the 2.1 billion document
limit, what is involved in moving to Solr DistributedSearch?
Thanks! :)
Correct.
ES currently does not let you change the number of shards after you've
created an Index (Collection in SolrCloud).
It does not let you split shards either. SolrCloud has an advantage
over ES around this at this point.
Otis
--
Solr & ElasticSearch Support -- http://sematext.com/
Performance Monitoring -- http://sematext.com/spm
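On the shard-splitting advantage mentioned above: Solr 4.3 added a SPLITSHARD
action to the Collections API. A hedged SolrJ sketch (collection and shard
names hypothetical):

import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.request.QueryRequest;
import org.apache.solr.common.params.ModifiableSolrParams;

public class SplitShard {
    public static void main(String[] args) throws Exception {
        HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr");
        ModifiableSolrParams params = new ModifiableSolrParams();
        params.set("action", "SPLITSHARD");       // splits one shard into two sub-shards
        params.set("collection", "mycollection"); // hypothetical names
        params.set("shard", "shard1");
        QueryRequest req = new QueryRequest(params);
        req.setPath("/admin/collections");
        server.request(req);
    }
}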
According to the ElasticSearch glossary, “You cannot change the number of
primary shards in an index, once the index is created.” Really? Is that true?
(A “primary shard” is what Solr calls a shard, or slice.)
In other words, even though you can easily “add shards” on ES, those are really
just replicas of the existing primary shards
Does https://issues.apache.org/jira/browse/SOLR-2112 help?
Otis
--
Solr & ElasticSearch Support -- http://sematext.com/
Performance Monitoring -- http://sematext.com/spm
On Fri, Jul 5, 2013 at 5:57 PM, Valery Giner wrote:
> As the simplest example, just write a query result into a file for processing by external programs
Furkan,
It's perfectly fine. Some people have small indices and lots of
queries, some have large indices and very few queries, and lucky ones
have very large indices and lots of queries at the same time.
We once helped a client take their indexing down from many hours to a
couple of minutes by u
And... is it based on Lucene/Solr?
-- Jack Krupansky
-----Original Message-----
From: Ali, Saqib
Sent: Friday, July 05, 2013 6:09 PM
To: solr-user@lucene.apache.org
Subject: Re: [Announcement] Norch- a search engine for node.js
Very interesting. What is the upper limit on the number of documents?
Very interesting. What is the upper limit on the number of documents?
Thanks! :)
On Fri, Jul 5, 2013 at 11:53 AM, Fergus McDowall
wrote:
> Here is some news that might be of interest to users and implementers of
> Solr
>
>
> http://blog.comperiosearch.com/blog/2013/07/05/norch-a-search-engine-for-node-js/
OK, I know that it is really unnecessary to start with a complex design. On the
other hand, if your resources and needs are adequate and you have a
bottleneck in your design, it is really a failure not to plan a new design.
We have terabytes of data and we have dedicated some developers
to Hadoop
As the simplest example, just write a query result into a file for
processing by external programs (the programs are out of our control,
and the result could contain millions of docs)
Thanks,
Val
On 07/05/2013 04:41 PM, Walter Underwood wrote:
What are you doing that start=500000 is normal? --wunder
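One hedged way to produce such a file without deep paging is the CSV response
writer over plain HTTP; the host, core name, and row count below are
illustrative:

import java.io.InputStream;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

public class ExportToCsv {
    public static void main(String[] args) throws Exception {
        // wt=csv streams the result set as CSV; rows must cover everything you want dumped
        URL url = new URL("http://localhost:8983/solr/collection1/select?q=*:*&wt=csv&rows=1000000");
        try (InputStream in = url.openStream()) {
            Files.copy(in, Paths.get("results.csv"), StandardCopyOption.REPLACE_EXISTING);
        }
    }
}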
Hello all,
Can anyone please share a solrj example for distributed solr?
Thanks! :)
What are you doing that start=500000 is normal? --wunder
On Jul 5, 2013, at 1:28 PM, Valery Giner wrote:
> Eric,
>
> We did not have any RAM problems, but just the following official limitation
> makes our life too miserable to use the shards:
>
> "Makes it more inefficient to use a high "sta
Eric,
We did not have any RAM problems, but just the following official
limitation makes our life too miserable to use the shards:
"Makes it more inefficient to use a high "start" parameter. For example,
if you request start=50&rows=25 on an index with 500,000+ docs per
shard, this will
Hi Ken,
Uh, I left this email until now hoping I could find you a reference to
similar reports, but I can't find them now. I am quite sure I saw
somebody with a similar report within the last month. Plus, several
people have reported issues with performance dropping when they went
from 3.x to 4.
And don't forget to test with sortable DocValues. I mean, sorting (and
faceting) was one of the main motivations for DocValues.
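A hedged schema.xml sketch of that, for Solr 4.2+; the field name is
hypothetical:

<field name="price" type="tlong" indexed="true" stored="true" docValues="true"/>

Sorting and faceting on such a field then read the on-disk DocValues
structure rather than the FieldCache.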
-- Jack Krupansky
-----Original Message-----
From: Otis Gospodnetic
Sent: Friday, July 05, 2013 3:42 PM
To: solr-user@lucene.apache.org
Subject: Re: Sorting
Hi Kowish,
Hi Kowish,
Here is an easy way to find out:
1) use copyField to copy from string to tlong
2) use ab or JMeter to hammer Solr while sorting on one or the other
field (separate runs)
3) compare :)
Since you have SLAs, I'm assuming you already have 2 and 3 in place.
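For step 1, the schema.xml wiring might look like this (field names are
hypothetical, and the stock tlong fieldType from the example schema is
assumed):

<field name="num_s" type="string" indexed="true" stored="false"/>
<field name="num_l" type="tlong" indexed="true" stored="false"/>
<copyField source="num_s" dest="num_l"/>

The ab or JMeter runs would then sort on num_s versus num_l.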
Otis
--
Solr & ElasticSearch Support -- http://sematext.com/
Here is some news that might be of interest to users and implementers of
Solr
http://blog.comperiosearch.com/blog/2013/07/05/norch-a-search-engine-for-node-js/
Norch (http://fergiemcdowall.github.io/norch/) is a search engine written
for Node.js. Norch uses the Node search-index module which is i
I don't want to sound negative, but I think it is a valid question to
consider - the lack of information and a certain mental rigidity may make
it sound bad. First of all, it is probably not for a few gigabytes of data,
and I can imagine that building indexes on the side where the data lives is
much faster
Also consider using the SweetSpotSimilarityFactory class, which allows you to
still engage normalization but control how intrusive it is. This, combined
with the ability to set a custom Similarity class on a per-fieldType basis,
may be extremely useful.
More info:
http://lucene.apache.org/sol
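For reference, the per-fieldType wiring might look like the following
schema.xml sketch; the length-norm values are illustrative, in the style of
the 4.x example schema, and per-fieldType similarity may also require the
global solr.SchemaSimilarityFactory to be declared:

<fieldType name="text_sweet" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <similarity class="solr.SweetSpotSimilarityFactory">
    <int name="lengthNormMin">3</int>
    <int name="lengthNormMax">5</int>
    <float name="lengthNormSteepness">0.5</float>
  </similarity>
</fieldType>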
Software developers are sometimes compensated based on the degree of
complexity that they deal with.
And managers are sometimes compensated based on the number of people they
manage, as well as the degree of complexity of what they manage.
And... training organizations can charge more and hav
Why is it better to require another large software system (Hadoop), when it
works fine without it?
That just sounds like more stuff to configure, misconfigure, and cause problems
with indexing.
wunder
On Jul 5, 2013, at 4:48 AM, Furkan KAMACI wrote:
> We are using Nutch to crawl web sites and
Since it works to fetch 10K rows and doesn't work to fetch 100K rows in a
single request, I very strongly suggest that you use the requests that work.
Make ten requests of 10K rows each. Or even better, 100 requests of 1K rows
each.
Large requests make large memory demands.
wunder
On Jul 5, 20
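A hedged SolrJ sketch of that batching; host, query, page size, and the total
cap are illustrative:

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrDocument;

public class BatchedFetch {
    public static void main(String[] args) throws Exception {
        HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");
        int pageSize = 1000;  // 100 requests of 1K rows each
        for (int start = 0; start < 100000; start += pageSize) {
            SolrQuery q = new SolrQuery("*:*");
            q.setStart(start);
            q.setRows(pageSize);
            QueryResponse rsp = server.query(q);
            for (SolrDocument doc : rsp.getResults()) {
                // process each document here
            }
            if (rsp.getResults().size() < pageSize) break;  // no more results
        }
    }
}

Note the point made elsewhere in this digest: very high start values get
expensive, especially on sharded setups, so the batch window should stay
modest.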
The normal Tomcat shutdown doesn't stop the server and takes a long time, so I
issue a kill -9 command. Any other suggestion for doing this without the locking?
I would initiate a backup again and send the logs.
regards,
Ayush
> Date: Fri, 5 Jul 2013 19:40:12 +0530
> Subject: Re: Solr 4.3 Master/
SolrJ doesn't have explicit support for that param but you can always
add it yourself.
For example:
CoreAdminRequest.Unload req = new CoreAdminRequest.Unload(false);
((ModifiableSolrParams) req.getParams()).set("deleteInstanceDir", true);
req.process(server);
On Thu, Jul 4, 2013 at 12:50 PM, Lyub
Oops I actually meant to say that search engines *are not* optimized
for large pages. See https://issues.apache.org/jira/browse/SOLR-1726
Well one of the shards involved in the request is throwing an error.
Check the logs of your shards. You can also add a shards.info=true
param to your search whi
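For example, a hedged SolrJ fragment (endpoint hypothetical); shards.info
adds a per-shard section to the response:

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;

public class ShardsInfoQuery {
    public static void main(String[] args) throws Exception {
        HttpSolrServer server = new HttpSolrServer("http://localhost:8080/solr/trcollection2");
        SolrQuery q = new SolrQuery("*:*");
        q.set("shards.info", true);  // ask each shard to report status and timing
        System.out.println(server.query(q).getResponse().get("shards.info"));
    }
}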
Well, our observation suggests that this happens only during spell check.
If we turn off the spell check, we don't see this issue occurring at all in
our 24-hour test run.
We have JBoss 5.1 in production running Solr 4.2.1 (without spellcheck) with no
issues at all.
Aditya
Thanks for your answer,
I can fetch 10K documents without any issue. I don't think we are hitting an
out-of-memory exception, because each Tomcat server in the cluster has 8GB of
memory allocated.
Hi,
What should be faster: sorting by a field of type string (solr.StrField) or
long (solr.TrieLongField)?
In both cases the values are numbers, so I can decide which type of field to use.
Is it possible to speed up sorting by a unique field? With sorting, my queries
are 10-100 times slower and I can't meet
Okay, so just for the rest of the people who dig up this thread: you
had to put all the extra jar files required by TYPO3 into WEB-INF/lib
to make this work. Is that right?
On Fri, Jul 5, 2013 at 8:03 PM, Michael Bakonyi
wrote:
> Hi Shalin,
>
> On 05.07.2013 at 16:23, Shalin Shekhar Mangar wrote:
Can you try to fetch a smaller number of documents? Search engines are
optimized for returning large pages. My guess is that one of the
shards is returning an error (maybe an OutOfMemoryError) for this
query.
On Fri, Jul 5, 2013 at 7:56 PM, eakarsu wrote:
> I am using Solr 4.3.1 on solrcloud with
Hi Shalin,
On 05.07.2013 at 16:23, Shalin Shekhar Mangar wrote:
> There are plenty of use-cases for having multiple cores. You may have
> two different schemas for two different kind of documents. Perhaps you
> are indexing content in multiple languages and you may want a core per
> language. In
Hi Giovanni,
Damn, you were right! I would never have hit on that!
Indeed, I copied a jar into that dir because somebody recommended that in a
post I found.
Thanks a lot for your help; now I'll have a look at the next error that appears ;)
Cheers,
Michael
On 05.07.2013 at 15:25, Giovanni Bri
I am using Solr 4.3.1 on SolrCloud with 10 nodes.
I added 3 million documents from a CSV file with this command:
curl
'http://localhost:8080/solr/trcollection2/update/csv?stream.file=/home/hduser/csvFile.csv&skipLines=1&fieldnames=,cache,segment,digest,tstamp,lang,url,,content,id,title,boost&stre
On Thu, Jul 4, 2013 at 4:32 PM, Michael Bakonyi
wrote:
> Hi everyone,
>
> I'm trying to get the CMS "TYPO3" connected with Solr 3.6.2.
>
> By now I followed the installation at http://wiki.apache.org/solr/SolrTomcat
> except that I didn't copy the .war-file into the $SOLR_HOME but referencing
>
The current implementation doesn't sort strictly on hit counts. Rather, it
gives you collations that have corrections with the nearest distance from the
original terms.
Sorting on query result score sounds like an interesting and do-able
alternative, although not supported currently. The cave
On Fri, Jul 5, 2013 at 6:14 PM, Cool Techi wrote:
>
> 1) That was my initial suspicion, but when I run ps -aux | grep "java",
> it doesn't show any other program running. I kill the process and start
> again and it locks.
How are you killing the process? A SIGKILL will leave a lock file in place
I saw something similar when I placed a jar in tomcat/lib (data import
handler); the right place was instead WEB-INF/lib.
I would try placing all needed jars there.
2013/7/5 Michael Bakonyi
> Hm, can't anybody help me out? I still can't get my installation to run
> correctly ...
>
> What I've fo
>> Is there a way to omitNorms and still be able to use {!boost b=boost}?
Or you could leave omitNorms="false" as usual and have your custom
Similarity implementation override the length normalization method to
use a constant value of 1.
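A hedged sketch of such a Similarity against the Lucene 4.x API (class name
hypothetical); it would then be referenced from a <similarity> element on the
fieldType:

import org.apache.lucene.index.FieldInvertState;
import org.apache.lucene.search.similarities.DefaultSimilarity;

// Keeps norms enabled but makes length normalization a no-op
public class ConstantLengthNormSimilarity extends DefaultSimilarity {
    @Override
    public float lengthNorm(FieldInvertState state) {
        // Constant 1 regardless of field length; return state.getBoost()
        // instead if index-time boosts should still be folded into the norm
        return 1.0f;
    }
}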
Regards
Pravesh
Thanks Jeroen and Upayavira!
I read the warning about losing the ability to use index time boosts when I
disable length normalization. And we actually use it; at least if it means
having a boost field in the index and doing queries like this:
"{!boost b=boost}( series:RCWP^10 OR otherFileds:que
1) That was my initial suspicion, but when I run ps -aux | grep "java",
it doesn't show any other program running. I kill the process and start
again and it locks.
2) When we fire a backup on the slave, the whole core hangs after a while and
replication stops as well. This was not happening w
Hm, can't anybody help me out? I still can't get my installation to run
correctly ...
What I've found out recently - if I understand it correctly:
SolrInfoMBean has something to do with JMX. So I manually activated JMX by
inserting <jmx/> within my solrconfig.xml as described here:
http://wiki.apache.org/
We are using Nutch to crawl web sites and it stores documents in HBase.
Nutch uses Solrj to send documents to be indexed. We have Hadoop in our
ecosystem as well. I think that there should be an implementation in Solrj
that sends documents (via CloudSolrServer or something like that) as
MapReduce jobs
This can mean multiple things:
1. You had killed a solr process earlier which left the lock file in place
2. You have more than one Solr core pointing to the same data directory
3. A solr process is already running and you are trying to start
another one with the same config.
On Fri, Jul 5, 2013 a
We have set up Solr 4.3 with a master/slave setup and are facing a couple of
issues: index locking - the index on the slave hangs at times, and when we
restart the core, the core gets locked up. I have checked the logs and there
are no OOM errors or anything else other than the error given below:
Caused by: or