Dear list,
I'm trying to delta-import with datasource FileDataSource and
processor FileListEntityProcessor. I want to load only files
which are newer than dataimport.properties -> last_index_time.
It looks like newerThan="${dataimport.last_index_time}" has no effect.
Can it be that
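For context, here is a minimal data-config.xml sketch of that setup (baseDir, fileName and the nested entity are placeholders, not the poster's actual config):

```xml
<dataConfig>
  <dataSource type="FileDataSource" encoding="UTF-8"/>
  <document>
    <!-- hypothetical paths and patterns; adjust to your layout -->
    <entity name="files"
            processor="FileListEntityProcessor"
            baseDir="/data/docs"
            fileName=".*\.xml$"
            recursive="true"
            rootEntity="false"
            newerThan="${dataimport.last_index_time}">
      <!-- nested entity that actually reads each file goes here -->
    </entity>
  </document>
</dataConfig>
```

Two things worth checking: that dataimport.properties is actually being written in your conf dir after a full import, and whether your Solr version expects date values wrapped in single quotes (the wiki examples use e.g. newerThan="'NOW-3DAYS'").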
Hi,
Here are the DIH commands that will trigger the indexing and
more: http://wiki.apache.org/solr/DataImportHandler#Commands
Here is how you search with
SolrJ: http://wiki.apache.org/solr/Solrj#Reading_Data_from_Solr
Otis
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/
Hello, I am new to Solr and trying to index some data from Oracle db through
the DataImportHandler. I would like to know as to how I can invoke the
indexing from SolrJ and then how to search the data from SolrJ. Thanks in
advance.
--
View this message in context:
http://lucene.472066.n3.nabble.c
Hi,
Hm, yeah, there is a library - the query parser classes in Lucene/Solr
themselves. As far as I know, you have to try parsing and if there is no
exception, the query is valid.
Otis
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene
Hi,
Lots of threads on that topic here:
http://search-lucene.com/?q=phrase+query+wildcard&fc_project=Lucene
And if you click that JIRA facet you'll see this as #1
hit: https://issues.apache.org/jira/browse/LUCENE-1486
(note: that's Lucene, not Solr)
Otis
Sematext :: http://sematext.com/ ::
Do you actually want to escape them with \ ?
Look at
this: http://search-lucene.com/?q=escape+query+characters&fc_project=Solr
Otis
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/
- Original Message
> From: Igor Chud
Hello,
I hope you are not running JBoss just to run Solr - there are simpler
containers
out there, e.g., Jetty.
Do you OOM?
Do things look better if you replicate less often (e.g. every 5 minutes instead
of every 60 seconds)?
Do all/some of those -X__ JVM params actually help?
Otis
Sematex
Have you looked at Solr cores? See:
http://wiki.apache.org/solr/CoreAdmin
maybe you can get away with keeping the separate indexes
Best
Erick
On Thu, Oct 14, 2010 at 11:19 AM, Upayavira wrote:
>
>
> On Thu, 14 Oct 2010 07:51 -0700, "bbarani" wrote:
> >
> > Hi,
> >
> > We are using SOLR to
i would not cross-reference solr results with your database to merge unless you
want to spank your database. nor would i load solr with all your data. what i
have found is that the search results page is generally a small subset of data
relating to the fuller document/result. therefore i store o
: Another question I have is where the processing of this "first letter" is
: more appropriate.
: I am considering updating my data import handler to execute a script to
: extract the first letter from the author field.
:
: I saw another thread where someone mentioned using a field analyser to extract
Hello everyone! I am new to Solr and Lucene and I would like to ask
you a couple of questions.
I am working on an existing system that has the data saved in a
Postgres DB and now I am trying to integrate Solr to use full-text
search and faceted search, but I am having a couple of doubts about
it.
Markus Jelsma wrote:
Here's a very recent thread on the matter:
http://lucene.472066.n3.nabble.com/facet-method-enum-vs-fc-td1681277.html
Thanks, that's helpful, but still leaves me with questions.
Yonik suggests with only ~25 unique facet values, method=enum is
probably the way to go.
Here's a very recent thread on the matter:
http://lucene.472066.n3.nabble.com/facet-method-enum-vs-fc-td1681277.html
> Thanks Yonik. I hadn't actually been using "enum" on facets with a
> small number of unique values; the wiki page doesn't give much guidance
> on when each is called for. Do you
Thanks Yonik. I hadn't actually been using "enum" on facets with a
small number of unique values; the wiki page doesn't give much guidance
on when each is called for. Do you have any rule of thumb for how few
unique values is "few enough" to want to use method=enum? Does it
matter if the fie
Thank you for both responses.
Another question I have is where the processing of this "first letter" is
more appropriate.
I am considering updating my data import handler to execute a script to
extract the first letter from the author field.
I saw another thread where someone mentioned using a field an
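If you go the script route, DIH's ScriptTransformer can do this inline; a hedged sketch, with hypothetical field and entity names:

```xml
<dataConfig>
  <script><![CDATA[
    // hypothetical transformer: copies the first letter of "author"
    // into a separate "author_first_letter" field for faceting
    function firstLetter(row) {
      var author = row.get('author');
      if (author != null && author.length() > 0) {
        row.put('author_first_letter', author.substring(0, 1).toUpperCase());
      }
      return row;
    }
  ]]></script>
  <document>
    <entity name="book"
            query="select id, author from books"
            transformer="script:firstLetter">
    </entity>
  </document>
</dataConfig>
```

The alternative mentioned in the thread, an analyzer-based approach, keeps the logic in schema.xml instead of the importer.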
On Thu, Oct 14, 2010 at 3:42 PM, Jonathan Rochkind wrote:
> I believe that should work fine in Solr 1.4.1. Creating a field with just
> first letter of author is definitely the right (possibly only) way to allow
> faceting on first letter of author's name.
>
> I have very voluminous facets (few
I believe that should work fine in Solr 1.4.1. Creating a field with
just first letter of author is definitely the right (possibly only) way
to allow faceting on first letter of author's name.
I have very voluminous facets (few facet values, many docs in each
value) like that in my app too,
you will find it in the distribution at example/solr/config
On Oct 14, 2010, at 3:04 PM, Ibrahim Diop wrote:
> Hi All,
>
> I'm a new solr user and I just want to know which schema.xml file to modify
> for this tutorial : http://lucene.apache.org/solr/tutorial.html
>
> Thanks,
>
> Ibrahim.
tks moysidis
On Thu, Oct 14, 2010 at 3:45 PM, Savvas-Andreas Moysidis <
savvas.andreas.moysi...@googlemail.com> wrote:
> Hi,
>
> yes, Solr does support fuzzy queries by using the Levenshtein Distance
> algorithm:
> http://en.wikipedia.org/wiki/Damerau%E2%80%93Levenshtein_distance
>
> You can speci
On 14.10.2010, at 21:02, Yonik Seeley wrote:
> On Thu, Oct 14, 2010 at 2:55 PM, Mike Squire wrote:
>> As pointed out before it would be useful to have some kind of
>> documented road map for development, and some kind of indication of
>> how close certain versions are to release.
>
> Such thing
Hi All,
I'm a new solr user and I just want to know which schema.xml file to
modify for this tutorial : http://lucene.apache.org/solr/tutorial.html
Thanks,
Ibrahim.
On Thu, Oct 14, 2010 at 2:55 PM, Mike Squire wrote:
> As pointed out before it would be useful to have some kind of
> documented road map for development, and some kind of indication of
> how close certain versions are to release.
Such things have proven to be very unreliable in the past, due to
Hi,
Thank you all for your quick response. Just to clarify I take it for
my particular problem (taking advantage of the spatial search
functionality) my best option is 3.1 and that should be reasonably
stable?
As pointed out before it would be useful to have some kind of
documented road map for d
Guys,
We have a website running Solr indexing books, and we use a facet to filter
books by author.
After some time, we detected that this facet is very large and we need to
create some other feature to help finding the information.
Our product team asked to create a page that can show all authors
On 14.10.2010, at 20:46, Yonik Seeley wrote:
> On Thu, Oct 14, 2010 at 2:39 PM, Jonathan Rochkind wrote:
>> Thanks Yonik! So I gather that the 1.5 branch has essentially been
>> abandoned, we can pretend it doesn't exist at all, it's been entirely
>> superseded by the 3.x branch, with the chang
Hi,
yes, Solr does support fuzzy queries by using the Levenshtein Distance
algorithm: http://en.wikipedia.org/wiki/Damerau%E2%80%93Levenshtein_distance
You can specify a fuzzy query by adding a tilde (~) symbol at the end of
your query as in title: Solr~
You can even specify a proximity threshold
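To make the truncated part concrete, the classic query parser syntax of that era looks roughly like this (the field name is illustrative):

```
title:Solr~        fuzzy match with the default similarity (0.5)
title:Solr~0.8     fuzzy match requiring similarity of at least 0.8
"apache solr"~10   proximity: the two terms within 10 positions of each other
```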
On Thu, Oct 14, 2010 at 2:39 PM, Jonathan Rochkind wrote:
> Thanks Yonik! So I gather that the 1.5 branch has essentially been
> abandoned, we can pretend it doesn't exist at all, it's been entirely
> superseded by the 3.x branch, with the changes made just for the purposes of
> synchronizing vers
Thanks Yonik! So I gather that the 1.5 branch has essentially been
abandoned, we can pretend it doesn't exist at all, it's been entirely
superseded by the 3.x branch, with the changes made just for the
purposes of synchronizing versions with Lucene.
Yonik Seeley wrote:
On Thu, Oct 14, 2010 at
Hi people,
Does anybody know if Solr has fuzzy functionality?
Tks
--
Claudio Devecchi
On Thu, Oct 14, 2010 at 1:50 PM, Jonathan Rochkind wrote:
> I'm kind of confused about Solr development plans in general, highlighted by
> this thread.
>
> I think 1.4.1 is the latest officially stable release, yes?
>
> Why is there both a 1.5 and a 3.x, anyway? Not to mention a 4.x? Which of
>
On Thu, Oct 14, 2010 at 1:58 PM, Lukas Kahwe Smith wrote:
> the current confusing list of branches is a result of the merge of the lucene
> and solr svn repositories. what baffles me is that so far the countless
> pleas for at least a rough roadmap or even just explanation for why so many
> br
On 14.10.2010, at 19:50, Jonathan Rochkind wrote:
> I'm kind of confused about Solr development plans in general, highlighted by
> this thread.
>
> I think 1.4.1 is the latest officially stable release, yes?
>
> Why is there both a 1.5 and a 3.x, anyway? Not to mention a 4.x? Which of
> the
The point/use-case of sharding/distributed search is for performance,
not for segregating different data in different places. Distributed
search assumes the same schema in each shard -- do you have that?
I don't think distributed search is meant to support the kind of "joining"
you describe, that
Ken,
Ok, I understand how the distributed search works, but I don't understand how
to build my query appropriately so that the results returned from the two
shards only return values that exist in both result sets.
In essence, I'm doing a join across the two shards on the resourceId.
So Cor
I'm kind of confused about Solr development plans in general,
highlighted by this thread.
I think 1.4.1 is the latest officially stable release, yes?
Why is there both a 1.5 and a 3.x, anyway? Not to mention a 4.x? Which
of these will end up being a stable release? Both? From which will come
The devs try and keep both the 3 and 4 (trunk) branches stable in the
terms you are talking about at all times. But bear in mind that more
radical changes will tend to hit trunk, probably making it by definition
less stable than 3. But it all depends - you might find a worse bug on
the 3 branch!
A
I forgot a few important details:
solr version = 1.4.1
current index size = 50gb
growth ~600mb / day
jboss runs with web settings (same as minimal)
2010/10/14
> Hi,
>
> as I am new here, I want to say hello and thanks in advance for your help.
>
>
> HW Setup:
>
> 1x SOLR Master - Sun Microsystem
Hi,
I've successfully downloaded and deployed 1.4.1, which is fine except it
doesn't support the spatial search stuff. I tried installing LocalSolr but
came to a bit of an impasse when it appeared to index stuff but didn't
return any results (and then I saw the last commit to the LocalSolr
reposit
Steve,
Using shards is actually quite simple; it's just a matter of setting up your
shards (via multiple cores, or multiple instances of SOLR) and then passing
the shards parameter in the query string. The shards parameter is a
comma-separated list of the servers/cores you wish to use together.
S
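A sketch of what such a query looks like (hosts and core names are made up):

```
http://localhost:8983/solr/core0/select?q=ipod
  &shards=localhost:8983/solr/core0,localhost:8983/solr/core1
```

Each entry in shards is the base URL of a core minus the http:// prefix; the core you send the request to can itself be one of the shards.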
Hi,
as I am new here, I want to say hello and thanks in advance for your help.
HW Setup:
1x SOLR Master - Sun Microsystems SUN FIRE X4450 - 4 x 2,93ghz, 64gb ram
1x SOLR Slave - Sun Microsystems SUN FIRE X4450 - 4 x 2,93ghz, 64gb ram
SW Setup:
Solaris 10 Generic_142901-03
jboss 5.1.0
JDK 1.6
Dear solr-user folks,
I would like to use the stats module to perform very basic statistics
(mean, min and max) which is actually working just fine.
Nevertheless I found a little limitation that bothers me a tiny bit:
how to perform the exact same statistics, but on the result of a
function que
On Thu, 14 Oct 2010 07:51 -0700, "bbarani" wrote:
>
> Hi,
>
> We are using SOLR to index data from DB / XML. There is one more
> application
> which uses SOLR for indexing the data.
>
> SOLR instance 1 --> index using DB / XML
> SOLR instance 2 --> Index created using an application.
>
> Bo
Hi,
I have a very simple question about indexing an existing index.
We have 2 indexes; index 1 is being maintained by us (it indexes the data from
a database) and we have an index 2 which is maintained by a tool.
Both the schemas are totally different but we are interested to re-index the
index p
Hi,
We are using SOLR to index data from DB / XML. There is one more application
which uses SOLR for indexing the data.
SOLR instance 1 --> index using DB / XML
SOLR instance 2 --> Index created using an application.
Both the schema files are different.
My question is: is there a way to
me also. great book, just wanted a bit more on complex DIH :)
On Oct 14, 2010, at 10:38 AM, Jason Brown wrote:
> Not related to the opening thread - but wanted to thank Eric for his book.
> Clarified a lot of stuff and very useful.
>
>
> -Original Message-
> From: Eric Pugh [mailto:ep..
Not related to the opening thread - but wanted to thank Eric for his book.
Clarified a lot of stuff and very useful.
-Original Message-
From: Eric Pugh [mailto:ep...@opensourceconnections.com]
Sent: Thu 14/10/2010 15:34
To: solr-user@lucene.apache.org
Subject: Re: What is the maximum numb
I would recommend looking at the work the HathiTrust has done. They have
published some really great blog articles about the work they have done in
scaling Solr, and have put in huge amounts of data.
The good news is that there isn't an exact number, because "It depends". The
bad news is t
If I understand your problem right, what you probably need is to escape those
characters (see "Escaping Special Characters"):
http://lucene.apache.org/java/2_9_1/queryparsersyntax.html
On 14 October 2010 14:36, Igor Chudov wrote:
> Let's say that I submit a query for a MoreLikeThis search. The query
> co
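If you are on SolrJ, ClientUtils.escapeQueryChars already does this for you; otherwise, here is a small hedged sketch of the idea. The character set follows the classic query parser syntax docs, and it deliberately escapes single & and | even though the parser's operators are the two-character && and ||:

```java
// EscapeDemo.java - hedged sketch of escaping Lucene/Solr query metacharacters
// so that e.g. the ":" in "X:2" is treated as text, not as a field separator.
public class EscapeDemo {
    // metacharacters from the classic query parser syntax documentation
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?";

    public static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            // prefix each special character with a backslash
            if (SPECIAL.indexOf(c) >= 0 || c == '|' || c == '&') {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // the algebra.com example from the thread
        System.out.println(escape("Solve a proportion X:2 = 4/5 and find X"));
    }
}
```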
Nice! :)
No further questions SIR! ;)
Thanks!
--
View this message in context:
http://lucene.472066.n3.nabble.com/check-if-field-CONTAINS-a-value-as-opposed-to-IS-of-a-value-tp1700495p1701120.html
Sent from the Solr - User mailing list archive at Nabble.com.
correct, it shows the transformations that happen to your indexed term (or
query term if you use the *Field value (query)* box) after each
Tokenizer/Filter is executed.
On 14 October 2010 14:40, PeterKerk wrote:
>
> Awesome again!
>
> And for my understanding, I type a single word "Boston" and t
Awesome again!
And for my understanding, I type a single word "Boston" and then I see 7
lines of output:
Boston
Boston
Boston
Boston
boston
boston
boston
So each line represents what is done to the query value after it has passed
through the filter?
--
View this message in context:
http://luc
Let's say that I submit a query for a MoreLikeThis search. The query
contains special characters, that Solr/Lucene interprets specially,
such as colon ":".
Example textual query is "Solve a proportion X:2 = 4/5 and find X".
(the context is website algebra.com).
My queries never intend those chara
yep, the Solr Admin web-app provides functionality that does exactly
that.. it can be reached at
http://{serverName}:{serverPort}/solr/admin/analysis.jsp
On 14 October 2010 14:28, PeterKerk wrote:
>
> It DOES work :)
>
> Oh and on the filters... is there some sort of debug/overview tool to see
> what
It DOES work :)
Oh and on the filters... is there some sort of debug/overview tool to see
what each filter does and what an input string looks like after going through
a filter?
--
View this message in context:
http://lucene.472066.n3.nabble.com/check-if-field-CONTAINS-a-value-as-opposed-to-IS-o
I think this should work. It might also be a good idea to investigate how
exactly each filter in the chain modifies your original text; this way you
will be able to better understand why certain queries match certain
documents.
On 14 October 2010 14:18, PeterKerk wrote:
>
> Correct, thanks!
>
>
Correct, thanks!
I have used the following:
--
View this message in context:
http://lucene.472066.n3.nabble.com/check-if-field-CONTAINS-a-
verbatim from schema.xml:
" "
so basically what this means is that when you index "Hello there mate" the
only text that is indexed and therefore searchable is the exact
phrase "Hello there mate" and *not* the terms Hello - there - mate.
What you need is a solr.TextField based type which splits (
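For comparison, a hedged sketch of what a tokenized type could look like in schema.xml (the type and field names are illustrative):

```xml
<fieldType name="text_ws" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- split on whitespace, then lowercase, so individual terms match -->
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<field name="introtext" type="text_ws" indexed="true" stored="true"/>
```

With a type like this, "Hello there mate" is indexed as the three terms hello, there and mate, so q=introtext:hi matches a document containing "hi there, this is me".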
This is the definition
--
View this message in context:
http://lucene.472066.n3.nabble.com/check-if-field-CONTAINS-a-value-as-opposed-to-IS-of-a-value-tp1700495p1700893.html
Sent from the Solr - User mailing list archive at Nabble.com.
looks like you are not tokenizing your field properly. What does your
schema.xml look like?
On 14 October 2010 13:01, Allistair Crossley wrote:
> actually no you don't .. if you want hi in a sentence of hi there this is me
> this is just normal tokenizing and should work .. check your field
> typ
Marco,
There are many factors that make this a difficult question to answer.
How many terms exist in those documents, how many fields, etc. You'll
only likely find out the exact parameters for a single Solr instance by
actually trying it with your own data.
Having said that, you can break down yo
super
On Oct 14, 2010, at 8:00 AM, Anthony Maudry wrote:
> Sorry for the late answer.
>
> It works now thanks to you, Allistair.
>
> I needed to use your "uid" field, common to the two entities but built in
> different ways.
>
> here is the result in a sample of the data-config.xml file
>
>
i think you answered the question by yourself ... these questions usually get
the response that there is no answer. solr/lucene scale and distribute to
whatever hardware you want to throw them.
you probably want to turn the question around - what is the maximum number of
documents that your sy
actually no you don't .. if you want hi in a sentence of hi there this is me
this is just normal tokenizing and should work .. check your field
type/analysers
On Oct 14, 2010, at 7:59 AM, Allistair Crossley wrote:
> i think you need to look at ngram tokenizing
>
> On Oct 14, 2010, at 7:55 AM, P
Hi all,
I am working on a performance specification document on a Solr/Lucene-based
application; this document is intended for the final customer. My question
is: what is the maximum number of documents I can index assuming 10 or
20kbytes for each document?
I could not find a precise answer to this
Sorry for the late answer.
It works now thanks to you, Allistair.
I needed to use your "uid" field, common to the two entities but built
in different ways.
here is the result in a sample of the data-config.xml file
...
...
...
...
uid is defined as unique
i think you need to look at ngram tokenizing
On Oct 14, 2010, at 7:55 AM, PeterKerk wrote:
>
> I try to determine if a certain word occurs within a field.
>
> http://localhost:8983/solr/db/select/?indent=on&facet=true&fl=id,title&q=introtext:hi
>
> this works if an EXACT match was found on fie
I try to determine if a certain word occurs within a field.
http://localhost:8983/solr/db/select/?indent=on&facet=true&fl=id,title&q=introtext:hi
this works if an EXACT match was found on field introtext, thus the field
value is just "hi"
But if the field value would be "hi there, this is just s
that's not the correct lib-dir, that's the lib-dir for jetty. please check your
installation.
if you use the tgz from lucene.apache.org/solr it should look like this (we have
added a few additional jars).
these are all of the jars in our solr-dir:
./contrib/clustering/lib/carrot2-mini-3.1.0.jar
./contrib/c
Hi,
Have applied SOLR-1553 to 1.4.2 and it works great.
However, I can't get the pf param to work. Example:
q=foo bar&qf=title^2.0 body^0.5&pf=title^50.0
Shouldn't I see the phrase query boost in debugQuery? Currently I see no trace
of pf being used.
--
Jan Høydahl, search solution architect
Queries such as:
Cat AND
(i.e the query is malformed with no second term provided)
Causes a ParseException.
Of course I could parse the query for sanity before it is submitted to Solr
but I wondered if there is a good practice way of checking/dealing with
queries which are incomplete? Perha
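There is no official pre-validation API as far as I know; the usual advice is to parse and catch the ParseException. If you still want a cheap pre-check before sending the query to Solr, here is a naive sketch (my own heuristic, not anything from the Solr codebase, and not a substitute for the real parser):

```java
// QueryCheck.java - naive pre-check for obviously incomplete classic-syntax
// queries such as "Cat AND". The real parser remains the only reliable test.
public class QueryCheck {

    public static boolean looksIncomplete(String q) {
        String t = q.trim();
        if (t.isEmpty()) {
            return true;
        }
        // trailing boolean operator, e.g. "Cat AND"
        if (t.matches(".*\\b(AND|OR|NOT)$") || t.endsWith("&&") || t.endsWith("||")) {
            return true;
        }
        // unbalanced parentheses
        int depth = 0;
        for (char c : t.toCharArray()) {
            if (c == '(') depth++;
            else if (c == ')') depth--;
            if (depth < 0) return true; // ")" before any "("
        }
        return depth != 0;
    }

    public static void main(String[] args) {
        System.out.println(looksIncomplete("Cat AND"));     // incomplete
        System.out.println(looksIncomplete("cat AND dog")); // fine
    }
}
```

A query that passes this check can still be rejected by the parser, so keep the try/catch around the actual parse regardless.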
results from both tables with 1 search - your first suggestion with separate
entities under document is right, or at least how i do it. things that i have
often found ...
0. check stdout for SQL errors
1. verify that your SQL works when you run it direct on your database!
2. verify that your sea
Thanks for your quick answer.
Actually I need to get result from both tables from a single search.
I tried to define correctly every field as you told me in your previous
message but I only get results from one table (actually "Newsfeeds")
On 14/10/2010 11:49, Allistair Crossley wrote:
a
actually your intention is unclear ... are you wanting to run a single search
and get back results from BOTH newsfeed and message? or do you want one or the
other? if you want one or the other you could use my strategy which is to store
the entity type as a field when indexing, e.g.
note, i
your first example is correct
i have the same config for indexing 5 different tables
what you don't have from what i can see is a field name mapped to each column,
e.g.
i always have to provide the destination field in schema.xml, e.g.
On Oct 14, 2010, at 5:22 AM, Anthony Maudry wrote:
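A sketch of the kind of per-column mapping described (the entity, column and field names are placeholders):

```xml
<entity name="newsfeeds" query="select id, title, body from newsfeeds">
  <field column="id"    name="uid"/>
  <field column="title" name="title"/>
  <field column="body"  name="text"/>
</entity>
```

Each <field> maps a SQL column to a field declared in schema.xml; without the mapping, DIH only picks up columns whose names happen to match schema fields.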
Hi everyone,
I'm trying to write some code for creating and using multi cores.
Is there a method available for this purpose or do I have to make an HTTP
request to a URL such as
http://localhost:8983/solr/admin/cores?action=STATUS&core=core0
Is there an API available for this purpose? For example, if I wan
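The CoreAdmin commands behind that URL are all plain HTTP; for example (host, port and core names are illustrative):

```
http://localhost:8983/solr/admin/cores?action=STATUS&core=core0
http://localhost:8983/solr/admin/cores?action=CREATE&name=core1&instanceDir=core1
http://localhost:8983/solr/admin/cores?action=SWAP&core=core0&other=core1
```

SolrJ also ships helper classes for this (the CoreAdminRequest family), so you do not have to build the URLs by hand.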
Hello,
I'm using Solr with a postgreSQL database. I need to search across two
tables with no link between them.
ie: I have got a "messages" table and a "newsfeeds" table, nothing
linking them.
I tried to configure my data-config.xml to implement this but it seems
that tables can't be defi
just a blind shot (didn't read the full thread):
what is your maxWarmingSearchers settings? For large indices we set it
to 2 (maximum)
Regards,
Peter.
> just update on this issue...
>
> we turned off the new/first searchers (upgrade to Solr 1.4.1), and ran
> benchmark tests, there is no noticeabl
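For reference, that setting lives in solrconfig.xml:

```xml
<!-- limit concurrent warming searchers; 2 is a common choice for large indices -->
<maxWarmingSearchers>2</maxWarmingSearchers>
```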
Ken,
I have been through that page many times. I could use Distributed search for
what? The first scenario or the second?
The question is: can I merge a set of results from the two cores/shards and
only return results that exist in both (determined by the resourceId, which
exists on both)?
Ch