Figured it out; I added a dependency for the DataImportHandler in my pom file.
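For anyone hitting the same error, the Maven dependency looks roughly like this (pick the version that matches your Solr version; 4.3.0 here is just an example):

```xml
<!-- DataImportHandler ships separately from solr-core -->
<dependency>
  <groupId>org.apache.solr</groupId>
  <artifactId>solr-dataimporthandler</artifactId>
  <version>4.3.0</version>
</dependency>
```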
On May 15, 2013, at 11:59 PM, PeriS wrote:
> After I turned on the logging, I found the following stack trace:
>
> Error loading class 'org.apache.solr.handler.dataimport.DataImportHandler'
>
> Not sure why the embed
There is the Lucene faceting module, which has nothing in common with
Solr, but it looks like it has something like what you are looking for.
http://shaierera.blogspot.com/2012/11/lucene-facets-part-1.html
On Thu, May 16, 2013 at 1:33 AM, Jan Morlock wrote:
> Hi,
>
> we are using faceted search f
After I turned on the logging, I found the following stack trace:
Error loading class 'org.apache.solr.handler.dataimport.DataImportHandler'
Not sure why the EmbeddedSolrServer is looking for it...
On May 15, 2013, at 7:26 PM, PeriS wrote:
> Although now its complaining about even though i hav
I'm trying to find out which routing algorithm (implicit/composite id) is being
used in my cluster. We are running Solr 4.1. I was expecting to see it in my
clusterState (based on a previous thread that someone else posted) but I don't
see it there. Could someone please help?
Thanks!
Santoash
: In Solr 1.4, on slave, I supplied a masterUrl, but did NOT supply any
: pollInterval at all on slave. I did NOT supply an "enable"
: "false" in slave, because I think that would have prevented even manual
: replication.
that exact same config should still work with solr 4.3
: This seemed to
"group.order" is not a valid parameter.
You're probably looking for "group.sort"
-Yonik
http://lucidworks.com
On Wed, May 15, 2013 at 9:30 PM, alexzhang wrote:
> I use the Solr 4.0.0 +, when I try to sort the results which within one
> group, it does not work?
>
> The wiki I referenced: http://
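For reference, a grouped query with in-group sorting looks roughly like this (the field names here are assumptions):

```
http://localhost:8983/solr/select?q=*:*&group=true&group.field=category&group.sort=price%20asc
```

group.sort orders the documents inside each group; the top-level sort parameter controls the ordering of the groups themselves.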
I am using Solr 4.0.0+; when I try to sort the results within one group,
it does not work.
The wiki I referenced: http://wiki.apache.org/solr/FieldCollapsing
The returned XML looks like:
..
2
AEST
Thanks for that info. So besides the two that I have already seen, are
there any more ways that the index directory can be named? I am working on
some home-grown administration scripts which need to know the name of the
index directory.
Bill
On Wed, May 15, 2013 at 7:13 PM, Mark Miller wrote:
You can't just hit the same handler twice? What about two different
handlers and pass the same config file via URL parameter?
Where does it make it single-threaded?
Regards,
Alex.
On 15 May 2013 19:18, "Shawn Heisey" wrote:
> On 5/15/2013 2:52 PM, Furkan KAMACI wrote:
>
>> You said "If I w
Although now it's complaining even though I have provided the correct core
name.
org.apache.solr.common.SolrException: No such core
On May 15, 2013, at 7:21 PM, PeriS wrote:
> Actually fixed it. So by accident i was using sole-core 3.x version. Once I
> upgraded the version of solar-cor
Actually fixed it. By accident I was using the solr-core 3.x version. Once I
upgraded the version of solr-core to 4.x it got resolved.
Thanks
-Peri.S
On May 15, 2013, at 7:14 PM, Shawn Heisey wrote:
> On 5/15/2013 5:02 PM, PeriS wrote:
>>> I m trying to use EmbeddedSolrServer but when tryin
On 5/15/2013 2:52 PM, Furkan KAMACI wrote:
You said "If I were doing this with the dataimport handler, I would define
more than one handler in solrconfig.xml, each with its own config file."
What is the benefit of using more than one handler?
DIH is single-threaded. By using more than one hand
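A sketch of what multiple handlers might look like in solrconfig.xml (handler names and config file names here are illustrative); each handler can then run a full-import concurrently:

```xml
<requestHandler name="/dataimport1"
                class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">data-config1.xml</str>
  </lst>
</requestHandler>
<requestHandler name="/dataimport2"
                class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">data-config2.xml</str>
  </lst>
</requestHandler>
```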
On 5/15/2013 5:02 PM, PeriS wrote:
I'm trying to use EmbeddedSolrServer, but when trying to initialize the
CoreContainer, I get the following error:
java.lang.NoClassDefFoundError: org/apache/solr/common/ResourceLoader
Did you only use the solrj jar and the jars in solrj-libs? This is
enough fo
It's fairly meaningless from a user perspective, but it happens when an index
is replicated that cannot be simply merged with the existing index files and
needs a new directory.
- Mark
On May 15, 2013, at 5:38 PM, Bill Au wrote:
> I am running 2 separate 4.3 SolrCloud clusters. On one of the
>
>
> I m trying to use EmbeddedSolrServer but when trying to initialize the
> coreContainer, get the following error;
> java.lang.NoClassDefFoundError: org/apache/solr/common/ResourceLoader
>
> Any ideas please?
>
> -Peri.S
>
Yeah, I keep forgetting that "min should match" for a BooleanQuery defaults
to 1 only if there are no required terms, but if there are required terms it
defaults to 0.
-- Jack Krupansky
-Original Message-
From: Chris Hostetter
Sent: Wednesday, May 15, 2013 1:28 PM
To: solr-user@lucen
On Wed, May 15, 2013 at 5:06 PM, Steven Bower wrote:
> This leads me to believe that the
> TransactionLog is not properly closing all of its files before getting rid
> of the object...
I tried some ad hoc tests, and I can't reproduce this behavior yet.
There must be some other code path that inc
They are visible to ls...
On Wed, May 15, 2013 at 5:49 PM, Yonik Seeley wrote:
> On Wed, May 15, 2013 at 5:20 PM, Steven Bower wrote:
> > when the TransactionLog objects are dereferenced
> > their RandomAccessFile object is not closed..
>
> Have the files been deleted (unlinked from the direct
There seem to be quite a few places where the RecentUpdates class is used
but is not properly created/closed throughout the code...
For example in RecoveryStrategy it does this correctly:
UpdateLog.RecentUpdates recentUpdates = null;
try {
  recentUpdates = ulog.getRecentUpdates();
  // ... use recentUpdates ...
} finally {
  if (recentUpdates != null) recentUpdates.close();
}
You can disable polling so that the slave never polls the master (in Solr
4.3 you can disable it from the Admin interface). And you can trigger a
replication using the HTTP API
http://wiki.apache.org/solr/SolrReplication#HTTP_API or, again, use the
Admin interface to trigger a manual replication.
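A sketch of the two HTTP calls (the slave host and port are placeholders):

```
# stop the slave from polling the master
http://slave:8983/solr/replication?command=disablepoll

# later, trigger a one-off replication manually
http://slave:8983/solr/replication?command=fetchindex
```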
On Wed, May 15, 2013 at 5:20 PM, Steven Bower wrote:
> when the TransactionLog objects are dereferenced
> their RandomAccessFile object is not closed..
Have the files been deleted (unlinked from the directory), or are they
still visible via "ls"?
-Yonik
http://lucidworks.com
I am running 2 separate 4.3 SolrCloud clusters. On one of them I noticed
the file data/index.properties on the replica nodes where the index
directory is named "index.".
On the other cluster, the index directory is just named "index".
Under what condition is index.properties created? I am tryin
Hi,
we are using faceted search for our queries. However neither sorting by
count nor sorting by index as described in [1] is suitable for our business
case. Instead, we would like to have the facets (or at least the beginning
of them) sorted by the score of the top document possessing the
corresp
Maybe we need a flag in the update handler to ignore commit requests.
I just enabled a similar thing for our JVM, because something, somewhere was
calling System.gc(). You can completely ignore explicit GC calls or you can
turn them into requests for a concurrent GC.
A similar setting for Solr
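For reference, the JVM flags being referred to are:

```
-XX:+DisableExplicitGC             # ignore System.gc() calls entirely
-XX:+ExplicitGCInvokesConcurrent   # turn System.gc() into a concurrent cycle
```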
On Wed, May 15, 2013 at 5:20 PM, Steven Bower wrote:
> I'm hunting through the UpdateHandler code to try and find where this
> happens now..
UpdateLog.addOldLog()
-Yonik
http://lucidworks.com
Most definitely understand the "don't commit after each record" advice...
unfortunately the data is being fed by another team which I cannot
control...
Limiting the number of potential tlog files is good but I think there is
also an issue in that when the TransactionLog objects are dereferenced
their Random
Shawn,
Sorry I did not acknowledge the additional information you provided.
I'd have to go back and re-examine all of the 3.5 settings again, as we had to
muck with them somewhat to get 4.2.1 to work. q.alt was a bit tricky; I'll have
to review our notes on that.
I solved the problem of the mis
Hmmm, we keep a number of tlog files open, based on the number of
records in each file (so we always have a certain amount of history),
but IIRC, the number of tlog files is also capped. Perhaps there is a
bug when the limit to tlog files is reached (as opposed to the number
of documents in the tlo
We have a system in which a client is sending 1 record at a time (via REST)
followed by a commit. This has produced ~65k tlog files and the JVM has run
out of file descriptors... I grabbed a heap dump from the JVM and I can see
~52k "unreachable" FileDescriptors... This leads me to believe that the
Hi Shawn;
You said "If I were doing this with the dataimport handler, I would define
more than one handler in solrconfig.xml, each with its own config file."
What is the benefit of using more than one handler?
2013/5/15 Shawn Heisey
> > The data is pulled from the MSSQL database.
> > I think th
I want to set up Solr replication between a master and slave, where no
automatic polling every X minutes happens, instead the slave only
replicates on command. [1]
So the basic question is: What's the best way to do that? But I'll
provide what I've been doing etc., for anyone interested.
Unt
You'll have to consult with the ManifoldCF project on exactly what
parameters they send SolrCell, but here's a raw SolrCell example:
curl "http://localhost:8983/solr/update/extract?literal.id=doc-1\
&commit=true&uprefix=attr_" -F "myfile=@HelloWorld.docx"
Query response:
...
The Solr "stats" search component does some basic aggregates: min, max,
count, sum, average, mean, sum of squares, standard deviation:
http://wiki.apache.org/solr/StatsComponent
-- Jack Krupansky
-Original Message-
From: eShard
Sent: Wednesday, May 15, 2013 2:45 PM
To: solr-user@luce
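For example, a stats request over a hypothetical numeric field named price:

```
http://localhost:8983/solr/select?q=*:*&rows=0&stats=true&stats.field=price
```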
Shawn Heisey [s...@elyograg.org]:
> Performance testing would be required in order to make a proper
> determination on whether SSD makes financial sense.
I fully agree.
[Lack of TRIM with RAID]
> then performance eventually suffers, and can become even worse than
> a spinning hard disk.
Do you
Good afternoon,
Does anyone know of a good tutorial on how to perform SQL-like aggregations
in Solr queries?
Thanks,
--
View this message in context:
http://lucene.472066.n3.nabble.com/How-to-aggregate-data-in-solr-4-0-tp4063584.html
Sent from the Solr - User mailing list archive at Nabble.com
Good afternoon,
I'm using solr 4.0 final with manifoldcf v1.2dev on tomcat 7.0.34
today, a user asked a great question. What if I only know the name of the
folder that the documents are in?
Can I just search on the folder name?
Currently, I'm only indexing documents; how do I capture the folder nam
: +Java +mysql +php TCL Perl Selenium -ethernet -switching -routing
that's missing one of the stated requirements...
: 2. Atleast one keyword out of* TCL Perl Selenium* should be present
...should be...
+Java +mysql +php +(TCL Perl Selenium) -ethernet -switching -routing
-Hoss
: Subject: Hierarchical Faceting
: References:
: <15062_1368600769_zzi0n0aykpk6h.00_519330be.7000...@uni-bielefeld.de>
: In-Reply-To:
: <15062_1368600769_zzi0n0aykpk6h.00_519330be.7000...@uni-bielefeld.de>
https://people.apache.org/~hossman/#threadhijack
Thread Hijacking on Mailing Lists
Hi Mark,
Yes, I am using reload. Here is the jira that I filed.
https://issues.apache.org/jira/browse/SOLR-4805
Please let me know if there is any additional data that you need.
On Wed, May 15, 2013 at 12:53 PM, Mark Miller wrote:
>
> On May 15, 2013, at 12:26 PM, Jared Rodriguez
> wrote:
: After some research the following syntax worked
: start_time_utc_epoch:[1970-01-01T00:00:00Z TO
: _val_:"merchant_end_of_day_in_utc_epoch"])
that syntax definitely does not work ... I don't know if there is a typo
in your mail, or if you are just getting strange results that happen to
look li
They need to be similar enough to satisfy the particular queries.
- Mark
On May 15, 2013, at 12:23 PM, Marcin wrote:
> Hi there,
>
> I am trying to figure out what SOLR means by compatible collection in order
> to be able to run the following query:
>
> |Query all shards of multiple compatib
Can't you use the PathHierarchyTokenizerFactory mentioned on that page?
I think it is called descendent-path in the default schema. Won't that
get you what you want?
UK/London/Covent Garden
becomes
UK
UK/London
UK/London/Covent Garden
and
India/Maharastra/Pune/Dapodi
becomes
India
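For reference, the descendent_path type in the example schema is defined roughly like this (details vary by version, so check your own schema.xml):

```xml
<fieldType name="descendent_path" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.PathHierarchyTokenizerFactory" delimiter="/"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
  </analyzer>
</fieldType>
```

The index-time tokenizer produces one token per path prefix, which is what yields the UK, UK/London, UK/London/Covent Garden expansion shown above.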
Hi there,
I am trying to figure out what SOLR means by compatible collection in
order to be able to run the following query:
|Query all shards of multiple compatible collections, explicitly specified:|
|http://localhost:8983/solr/collection1/select?collection=collection1_NY,collection1_NJ,col
On May 15, 2013, at 12:26 PM, Jared Rodriguez wrote:
> the cores in the collection stay offline even if there are no
> material changes.
I've used reload - if you are having trouble with it, please post more details
or file a JIRA issue.
- Mark
Thank you very much for your answers!
As I said earlier, I am new to Solr and I would like to know if there is any
way to get the list of terms in the document with that ID.
I found that I can add the parameters facet=true and facet.field=TITLE
to return all terms ordered by frequency.
> The data is pulled from the MSSQL database.
> I think the bottleneck for indexing in SOLR.
> Is it possible to further boost by kettle?
I don't know what kettle is or what its capabilities are.
Can you run more than one instance of kettle at the same time, each one
retrieving part of the databa
Seems fast to me, too. We get about 600/second pulling data from MySQL with a
pretty complicated query.
Check the CPU usage on the Solr machine. If that is not reaching 100% for
periods of time, then Solr is not the bottleneck. Indexing is very
CPU-intensive.
On a multi-CPU Solr machine, you w
On 15 May 2013 21:44, horot wrote:
> Hi, Gora!
>
> The data is pulled from the MSSQL database.
> I think the bottleneck for indexing in SOLR.
Why do you think so? Have you checked the CPU/memory
usage on the Solr server? Likewise for the database
server?
Also, I had somehow glossed over your num
"3500-5000 records per second. This is a very small speed."
That's hardly a slow rate for ingestion of data!
Who is telling you that it is?
That is not to say that the speed can't be improved, but let's keep things
in perspective.
And of course the speed does depend on your schema and actual
I have used both and they seem to work well for basic operations - create,
delete, etc. Although newer operations like reload do not function as they
should - the cores in the collection stay offline even if there are no
material changes.
On Wed, May 15, 2013 at 6:53 AM, A.Eibner wrote:
> Hi,
Hi, Gora!
The data is pulled from the MSSQL database.
I think the bottleneck is indexing in Solr.
Is it possible to boost it further with Kettle?
--
View this message in context:
http://lucene.472066.n3.nabble.com/how-to-increase-upload-into-Solr-4-x-tp4063451p4063540.html
Sent from the Solr - Use
On 15 May 2013 15:36, horot wrote:
> Hi,
>
> I use Pentaho Kettle to upload data into Solr. The average speed is
> 3500-5000 records per second.
> This is a very small speed. Is there a quick tool that would give the
> highest speed, or it depends on the Solr?
First, you would need to figur
Hi,
I use Pentaho Kettle to upload data into Solr. The average speed is
3500-5000 records per second.
This is a very low speed. Is there a quicker tool that would give a higher
speed, or does it depend on Solr?
--
View this message in context:
http://lucene.472066.n3.nabble.com/how-to
On 5/15/2013 3:56 AM, pankaj.pand...@wipro.com wrote:
> Thanks Shawn for explaining everything in such detail, it was really helpful.
>
> Have few more queries on the same. Can you please explain the purpose of the
> 3rd box in minimal configuration, with the standalone zookeeper?
A zookeeper e
On 5/15/2013 4:53 AM, A.Eibner wrote:
> I just wanted to ask, if anyone is using the collections API to create
> collections,
> or if not how they use the coreAPI to create a collection with
> replication ?
For my little SolrCloud install using 4.2.1, I have used the collections
API exclusively.
> -Original Message-
> From: Keith Naas [mailto:keithn...@dswinc.com]
> Sent: Tuesday, May 14, 2013 3:31 PM
> Stepping through the code on a live instance we can see the cache being
> "disabled" by the destroy calls after each root doc. This destruction causes
> EntityProcessorBase to ch
1. Create a schema that accommodates both types of fields, either using
optional fields or dynamic fields.
2. Create some sort of differentiator key (e.g. schema), separately
from id (which needs to be globally unique, so possibly schema+id)
3. Use that schema in filter queries (fq) to look only at s
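A sketch of step 3, assuming the differentiator field is called schema and the source names are placeholders:

```
# only documents that came from the first source
http://localhost:8983/solr/select?q=smith&fq=schema:sourceA

# only documents that came from the second source
http://localhost:8983/solr/select?q=smith&fq=schema:sourceB
```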
On 5/15/2013 8:49 AM, vrparekh wrote:
> I have two different solr servers. Both server has different schema.
>
> Is it possible to shard these two solr server?
>
> Or is there any other way to combine/merge results of two different solr
> servers?
In general, this won't work. If your two schema
Hello All,
I have two different solr servers. Both server has different schema.
Is it possible to shard these two solr server?
Or is there any other way to combine/merge results of two different solr
servers?
--
View this message in context:
http://lucene.472066.n3.nabble.com/sharding-betw
You cannot currently adjust the number of replicas with the collections api -
you have to use the core admin api. Which means you determine the replica
placement based on what server you hit with the core admin api.
http://wiki.apache.org/solr/SolrCloud#Creating_cores_via_CoreAdmin
Create 2 mor
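A sketch of the CoreAdmin call to add one replica of shard1 on a chosen node (the host and core name are assumptions):

```
http://newnode:8983/solr/admin/cores?action=CREATE&name=collection1_shard1_replica3&collection=collection1&shard=shard1
```

Hitting a different node with the same call places the replica there, which is how you control placement.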
I'd use Jetty for SolrCloud - much, much, much better tested.
Here is a note on something similar around tomcat:
http://stackoverflow.com/questions/10570672/get-nohttpresponseexception-for-load-testing
Perhaps that helps, perhaps not.
The root cause is: org.apache.http.NoHttpResponseException:
On 5/15/2013 12:52 AM, Bernd Fehling wrote:
> while I can't get solr 4.3 with run-jetty-run up and running under eclipse
> for debugging I tried to switch back to slf4j and followed
> the steps of http://wiki.apache.org/solr/SolrLogging
>
> Unfortunately eclipse bothers me with an error:
> The imp
On 5/15/2013 1:57 AM, Toke Eskildsen wrote:
> On Wed, 2013-05-15 at 08:31 +0200, Shawn Heisey wrote:
>> http://wiki.apache.org/solr/SolrPerformanceProblems
>>
>> I really was serious about reading that page, and not just because I
>> wrote it.
>
> That page makes a clear recommendation of RAM over
We have a simple SolrCloud setup (4.1) running with a single shard and
multiple replicas across 3 servers, and it's working fine except once in a
while,
the leader logs this error. We fine-tuned GC among other things and everything
is lightning fast. However, we still receive this SEVERE error a
On Wed, May 15, 2013 at 7:25 AM, sathish_ix wrote:
> Hi , i would like to get all documents when searching for a keyword.
>
> http://localhost:8080/solr/select?q=caram&rows=_val_:"docfreq(SEARCH_TERM,'caram')"
>
> Searching for 'caram', there are 200 documents, but iam getting first 10
> documents
Hi Anshum,
What if you have more nodes than shards * replicationFactor?
In the example below, originally I created the collection to use 6
shards* 2 replicationFactor = 12 nodes total.
Now I added 6 more nodes, 18 nodes total. I just want to add 1 extra
replica per shard.
How will it get evenly
Although technically it may be possible to put 1 billion documents in a
single Solr/Lucene index (2 billion hard limit), I would recommend simply:
Don't do it! Don't try to put more than 250 million documents on a single
Solr node. In fact, 100 million is a better, more realistic limit.
To be
Yeah, I use both on an empty Solr - what is the error?
- Mark
On May 15, 2013, at 6:53 AM, A.Eibner wrote:
> Hi,
>
> I just wanted to ask, if anyone is using the collections API to create
> collections,
> or if not how they use the coreAPI to create a collection with replication ?
>
> Becaus
Hi
I went through that, but I want to index multiple locations in a single
document, and a single location has multiple features/attributes like
country, state, district, etc. I want to index them and get hierarchical
facet results from a facet pivot query. One more thing: my documents vary
and may have single, two
Hi, I would like to get all documents when searching for a keyword.
http://localhost:8080/solr/select?q=caram&rows=_val_:"docfreq(SEARCH_TERM,'caram')"
Searching for 'caram', there are 200 documents, but I am getting only the
first 10 documents.
I thought of adding a function to the rows parameter.
Can we pass resu
Hello again!
Of course the part that parses the URL requests must be on the servlet
side. However, if the project is well modularized, those Java classes,
dependencies... whatever, could be reused inside the SolrJ project, so I don't
think that it would be an enormous amount of extra work. That's the magic of
hi all
I want to index 2 separate, unrelated tables from a database into a single
Solr core and search each set of documents separately. How can I do it?
please help
thanks in advance
regards
Rohan
Hi,
I just wanted to ask, if anyone is using the collections API to create
collections,
or if not how they use the coreAPI to create a collection with
replication ?
Because I run into errors when creating a collection on an empty solr.
Kind regards
Alexander
Hi,
I filed an issue at https://issues.apache.org/jira/browse/SOLR-4734
I also tried this with 4.3, but the same error occurs.
Should I post on the dev list ?
Kind regards
Alexander
Am 2013-04-16 23:47, schrieb Chris Hostetter:
: sorry for pushing, but I just replayed the steps with solr 4.0
http://wiki.apache.org/solr/HierarchicalFaceting
On Wed, May 15, 2013, at 09:44 AM, varsha.yadav wrote:
> Hi Everyone,
>
> I am working on Hierarchical Faceting. I am indexing location of
> document with their state and district.
> I would like to find counts of every country with state count
On 15 May 2013 15:20, amit wrote:
> I am installing solr on tomcat7 in aws using bitmani tomcat stack.
[...]
If you are using the Bitnami stack, I would direct questions
to their forums. Having used Bitnami for AWS deployments,
we have gone back to rolling our own starting from a standard
Linux s
Thanks Shawn for explaining everything in such detail, it was really helpful.
I have a few more queries on the same topic. Can you please explain the purpose
of the 3rd box in the minimal configuration, with the standalone zookeeper?
On a separate note, if we go ahead with 4 boxes (8 shards with replication
f
I am installing Solr on Tomcat 7 in AWS using the Bitnami Tomcat stack. My
Solr server is not starting; below is the error:
INFO: Starting service Catalina
May 15, 2013 7:01:51 AM org.apache.catalina.core.StandardEngine startInternal
INFO: Starting Servlet Engine: Apache Tomcat/7.0.39
May 15, 2013 7:01:
Hi Everyone,
I am working on Hierarchical Faceting. I am indexing location of
document with their state and district.
I would like to find counts of every country with state counts and
district counts. I found facet pivot working well to give me counts if I
use single-valued fields like
---
Just on our experiences, we have a large collection (350M documents, but
1.2Tb in size spread across 4 shards/machines and multiple replicas, we may
well need more) and the first thing we needed to do for size estimation was
to work out how big a set number of documents would be on disk. So we did
On Wed, 2013-05-15 at 08:31 +0200, Shawn Heisey wrote:
> http://wiki.apache.org/solr/SolrPerformanceProblems
>
> I really was serious about reading that page, and not just because I
> wrote it.
That page makes a clear recommendation of RAM over SSDs.
Have you done any performance testing on this?