Thanks for the information...
I did come across that discussion, I guess I will try to write a customized
Similarity class and disable tf.
I hope this is not totally odd to do ... I do notice about a 10GB .frq file
size in cores that have 10-30GB of .fdt files in total. I wish the benchmark would
show me
Is there any similar approach that I could use in Solr 3.6.1, or should I add
this logic to my application?
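For what it's worth, the custom Similarity idea mentioned above could look roughly like this for Lucene/Solr 3.6 (an untested sketch; the class name is my own, and it assumes DefaultSimilarity as the base class):

```java
import org.apache.lucene.search.DefaultSimilarity;

// Hypothetical sketch: a Similarity that ignores term frequency, so a term
// occurring ten times in a document scores the same as one occurring once.
public class NoTfSimilarity extends DefaultSimilarity {
    @Override
    public float tf(float freq) {
        return freq > 0 ? 1.0f : 0.0f;
    }
}
```

It would then be registered via the <similarity> element in schema.xml, pointing at the fully qualified class name.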
- Original Message -
From: "Upayavira"
To: solr-user@lucene.apache.org
Sent: Saturday, December 15, 2012 12:37:11
Subject: Re: Dedup component
Nope, it is a Solr 4.0 thing. In ord
I have changed to use dih.xx but still no luck. Even with dataimport or
dataimporter the query is able to fetch the delta records but they are not
able to commit to solr. Would there be any other reason why this would fail?
--
View this message in context:
http://lucene.472066.n3.nabble.com/Nee
Hello,
I am using SolrCloud 4.0.0 and trying to get the Suggester to work. I have
set it up according to the wiki instructions but can't get it to return any
suggestions. Here is my setup:
*schema.xml*
*solrconfig.xml*
text_auto
suggest
org.apache.solr.spelling.sug
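For comparison, the wiki's Suggester setup is typically along these lines in solrconfig.xml (a hedged sketch; the field name text_auto comes from the schema above, the other values are common defaults, not taken from this thread):

```xml
<searchComponent name="suggest" class="solr.SpellCheckComponent">
  <lst name="spellchecker">
    <str name="name">suggest</str>
    <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
    <str name="lookupImpl">org.apache.solr.spelling.suggest.tst.TSTLookup</str>
    <str name="field">text_auto</str>
    <str name="buildOnCommit">true</str>
  </lst>
</searchComponent>

<requestHandler name="/suggest" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="spellcheck">true</str>
    <str name="spellcheck.dictionary">suggest</str>
    <str name="spellcheck.count">10</str>
  </lst>
  <arr name="components">
    <str>suggest</str>
  </arr>
</requestHandler>
```

A common gotcha is forgetting to build the dictionary (spellcheck.build=true, or buildOnCommit as above) before expecting suggestions.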
There is a /zookeeper servlet that the admin UI uses for the Cloud tab. I don't
know much about it, I think Ryan wrote it.
The other option is to talk to zk directly.
I also plan on adding an admin handler for ZooKeeper at some point.
- Mark
On Dec 15, 2012, at 12:33 PM, Luis Cappa Banda wrot
When your index is all cached by OS you won't see disk IO. Smaller heap,
smaller caches, more RAM.
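To illustrate the point: with a 6GB VM and a 5.4GB index, a 4GB heap leaves well under 2GB for the OS page cache, so most index reads go to disk. One hedged way to rebalance (the file path and sizes are assumptions, not from this thread) is to cap Tomcat's heap, e.g. in bin/setenv.sh:

```shell
# Hypothetical setenv.sh fragment: a 2GB heap leaves roughly 4GB of the
# 6GB VM for the OS page cache, which can then hold most of the 5.4GB index.
export CATALINA_OPTS="-Xms2g -Xmx2g"
```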
Otis
--
Performance Monitoring - http://sematext.com/spm
On Dec 15, 2012 1:11 PM, "S L" wrote:
> My virtual machine has 6GB of RAM. Tomcat is currently configured to use
> 4GB
> of it. The size of
On Sat, Dec 15, 2012 at 1:11 PM, S L wrote:
> My virtual machine has 6GB of RAM. Tomcat is currently configured to use 4GB
> of it. The size of the index is 5.4GB for 3 million records which averages
> out to 1.8KB per record. I can look at trimming the data, having fewer
> records in the index to
P.S. Regarding streaming of the data: my Java servlet uses SolrJ and iterates
through the results. Right now I'm focused on getting rid of the delay that
causes some queries to take 6 or 8 seconds to complete, so I'm not even
looking at the performance of the streaming.
My virtual machine has 6GB of RAM. Tomcat is currently configured to use 4GB
of it. The size of the index is 5.4GB for 3 million records which averages
out to 1.8KB per record. I can look at trimming the data, having fewer
records in the index to make it smaller, or getting more memory for the VM.
Nope, it is a Solr 4.0 thing. In order for it to work, you need to store
every field, as what it does behind the scenes is retrieve the stored
fields, rebuild the document, and then post the whole document back.
Upayavira
On Sat, Dec 15, 2012, at 04:52 PM, Jorge Luis Betancourt Gonzalez wrote:
On Sat, Dec 15, 2012 at 12:04 PM, S L wrote:
> Thanks everyone for the responses.
>
> I did some more queries and watched disk activity with iostat. Sure enough,
> during some of the slow queries the disk was pegged at 100% (or more.)
>
> The requirement for the app I'm building is to be able to r
Thanks a lot, Per. Now I understand the whole scenario. One last question:
I've been searching for some kind of request handler that retrieves
cluster status information, but no luck. I know there is a JSON file called
clusterstate.json, but I don't know the way to get it in raw
JSON
I just did the experiment of retrieving only the metaDataUrl field. I still
sometimes get slow retrieval times. One query took 2.6 seconds of real time
to retrieve 80k of data. There were 500 results. QTime was 229. So, I do
need to track down where the extra 2+ seconds is going.
Thanks everyone for the responses.
I did some more queries and watched disk activity with iostat. Sure enough,
during some of the slow queries the disk was pegged at 100% (or more.)
The requirement for the app I'm building is to be able to retrieve 500
results in ideally one second. The index has
Are these updatable fields available in Solr 3.6.1? That is the version I'm
using right now.
- Original Message -
From: "Upayavira"
To: solr-user@lucene.apache.org
Sent: Saturday, December 15, 2012 7:56:45
Subject: Re: Dedup component
Make the ID field out of the query text so you don't have t
Otis,
Can you give more details on this ? Sounds interesting to me. What about if
you are trying to re-order millions of Lucene documents ? Did you use
grouping first ?
Antoine.
On Thu, Dec 13, 2012 at 8:54 PM, Otis Gospodnetic <
otis.gospodne...@gmail.com> wrote:
> Hi,
>
> We've done something
Maybe we're at the stage of raising the issue of whether the significant
extra storage for time of day warrants a storage format that is optimized
for day only, call it TrieDay (or TrieDateTimeless.)
-- Jack Krupansky
-Original Message-
From: jmlucjav
Sent: Saturday, December 15, 201
Make the ID field out of the query text so you don't have to use the
dedup component, then use the updatable fields functionality in Solr
4.0:
$ curl http://localhost:8983/solr/update -H 'Content-type:application/json' -d '
[
  {"id": "book1",
   "copies_i" : { "inc" : 1},
   "cat" : {
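The "inc" payload in that curl example can also be generated programmatically. A minimal sketch in plain Java, without SolrJ (the class and helper names are hypothetical; only the "inc" case from the example is shown):

```java
// Hypothetical helper: renders one atomic-update document in the JSON shape
// that Solr 4.0's /update handler accepts for {"inc": n} field updates.
public class AtomicUpdatePayload {
    static String incPayload(String id, String counterField, int by) {
        return "[{\"id\":\"" + id + "\",\"" + counterField
                + "\":{\"inc\":" + by + "}}]";
    }

    public static void main(String[] args) {
        // Same update as the curl example: bump copies_i on book1 by one.
        System.out.println(incPayload("book1", "copies_i", 1));
    }
}
```

Printing the payload for "book1" yields [{"id":"book1","copies_i":{"inc":1}}], which can be POSTed to /solr/update with Content-type application/json.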
Luis Cappa Banda wrote:
Do you know if SolrCloud replica shards have 100% the same data as the
leader ones every time? Probably when synchronizing with leaders there
is a delay, so executing queries against replicas won't be a good idea.
As long as the replica is in state "active" it will be 100
Hello, Per.
Thanks for your answer! I have worked a lot with SolrJ, and in the last two
months also with the new SolrJ 4.0, specifically with the ZooKeeper and
CloudSolrServer implementation. I've developed a search engine wrapper that
dispatches queries to SolrCloud using a CloudSolrServer pool. Th
without going through such rigorous testing, maybe for my case (interested
only in DAY), I could just index the trielong values such as 20121010,
20110101 etc...
This would take less space than trieDate (I guess), and I still have a date
looking number (for easier handling). I could even base the
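That day-only encoding could be sketched in Java like this (the helper name is my own, and it assumes UTC is the timezone you want to truncate in):

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

// Hypothetical helper: collapse a timestamp to a sortable yyyyMMdd long,
// e.g. 2012-10-10 becomes 20121010, suitable for indexing as a trie long.
public class DayNumber {
    static long toDayNumber(Date d) {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyyMMdd");
        fmt.setTimeZone(TimeZone.getTimeZone("UTC"));
        return Long.parseLong(fmt.format(d));
    }

    public static void main(String[] args) {
        // 1349827200000L is 2012-10-10T00:00:00Z; prints 20121010.
        System.out.println(toDayNumber(new Date(1349827200000L)));
    }
}
```

Range queries on such a field still behave correctly, since yyyyMMdd longs sort in date order.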
As Mark mentioned Solr(Cloud) can be accessed through HTTP and return
e.g. JSON which should be easy to handle in a javascript. But the
client-part (SolrJ) of Solr is not just a dumb client interface - it
provides a lot of client-side functionality, e.g. some intelligent
decision making based o