I think this is what I'm looking for. What is the status of this patch?
On Thu, Sep 3, 2009 at 12:00 PM, R. Tan wrote:
> Hi Solrers,
> I would like to get your opinion on how to best approach a search
> requirement that I have. The scenario is I have a set of business listings
> that may be grou
unfortunately DIH is not yet integrated with ExtractingRequestHandler .
see this https://issues.apache.org/jira/browse/SOLR-1358
On Thu, Sep 3, 2009 at 5:34 AM, Khai Doan wrote:
> Hi all,
>
> My name is Khai. I have a table in a relational database. I have
> successfully used DataImportHandler
Hi Solrers,
I would like to get your opinion on how to best approach a search
requirement that I have. The scenario is I have a set of business listings
that may be grouped into one parent business (such as 7-eleven having several
locations). On the results page, I only want 7-eleven to show up once
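
One way to picture the requirement is as a collapse of the ranked result list by parent business. The following is only a client-side sketch of that idea, not a Solr feature; the `parent_id` field name and the sample documents are hypothetical.

```python
def collapse_by_parent(results, key="parent_id"):
    """Keep only the first (highest-ranked) hit for each parent business.

    Documents without a parent are grouped under their own id, so
    stand-alone listings are never dropped.
    """
    seen = set()
    collapsed = []
    for doc in results:
        group = doc.get(key) or doc["id"]
        if group not in seen:
            seen.add(group)
            collapsed.append(doc)
    return collapsed


# Hypothetical ranked results: two 7-eleven branches and one independent shop.
ranked = [
    {"id": "loc-1", "parent_id": "7-eleven", "name": "7-eleven Main St"},
    {"id": "loc-2", "parent_id": "7-eleven", "name": "7-eleven Harbour Rd"},
    {"id": "shop-9", "name": "Corner Deli"},
]
for doc in collapse_by_parent(ranked):
    print(doc["name"])
```

Collapsing after the search keeps the example simple, but it interacts badly with paging, since a page of raw hits may shrink once duplicates are removed.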
Sorry for the duplicate post, if ever, can anyone share their experience on
holding facet heading/value IDs in Solr?
On Fri, Aug 28, 2009 at 3:27 AM, Rihaed Tan wrote:
> Hi,
>
> I have a similar requirement to Matthew (from his post 2 years ago). Is
> this still the way to go in storing both the
Spaces, if present in a term query, should be escaped before searching.
myField:quick\ brown\ fox
is the correct way to search for quick brown fox in myField using TermQuery.
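
A minimal sketch of that escaping on the client side, assuming the value contains no other query-syntax characters that would also need escaping; the helper names are made up for illustration.

```python
def escape_spaces(value):
    """Backslash-escape spaces so the query parser keeps the value as one term."""
    return value.replace(" ", "\\ ")


def term_query(field, value):
    """Build a field:value clause with the spaces in the value escaped."""
    return field + ":" + escape_spaces(value)


print(term_query("myField", "quick brown fox"))
```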
You can always add &debugQuery=on to your search URL. The response will contain
a lot of helpful information on how the incomi
rajan chandi wrote:
Use Apache Luke.
If you're using a newer Lucene, you might need to add the Lucene 2.9 JAR files
to Luke and build it.
Just an FYI. Luke can be launched by ant at the solr install directory:
$ ant luke
Koji
Cheers
Rajan
On Wed, Sep 2, 2009 at 2:02 PM, Jason Rutherglen
I think I understand what happened.
The query "+specific_LIST_s:For Sale" is processed and broken into "For" and
"Sale". The specific_LIST_s field is a "string", so it is not tokenized, but
remains indexed as "For Sale", which matches neither "For" nor "Sale". Hence,
no results.
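
The splitting described above can be imitated with a toy tokenizer. This is a simplified model for illustration only, not the actual Lucene query parser: it breaks a query string on whitespace unless the whitespace is backslash-escaped or inside double quotes.

```python
def split_clauses(query):
    """Split a query on whitespace that is neither escaped nor quoted."""
    clauses = []
    current = []
    in_quotes = False
    escaped = False
    for ch in query:
        if escaped:
            # Character after a backslash is kept verbatim.
            current.append(ch)
            escaped = False
        elif ch == "\\":
            current.append(ch)
            escaped = True
        elif ch == '"':
            current.append(ch)
            in_quotes = not in_quotes
        elif ch.isspace() and not in_quotes:
            # Unprotected whitespace ends the current clause.
            if current:
                clauses.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        clauses.append("".join(current))
    return clauses


print(split_clauses("+specific_LIST_s:For Sale"))
print(split_clauses('+specific_LIST_s:"For Sale"'))
print(split_clauses("+specific_LIST_s:For\\ Sale"))
```

In this model only the first form falls apart into two clauses; the quoted and escaped forms stay in one piece, which is what a "string" field indexed as "For Sale" needs.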
This que
Hi all,
My name is Khai. I have a table in a relational database. I have
successfully used DataImportHandler to import this data into Apache Solr.
However, one of the columns stores the location of a PDF file. How can I
configure DataImportHandler to use ExtractingRequestHandler to extract the
conte
For HDFS, failover, sharding you may want to use Solr with Katta.
There's an issue open at:
http://issues.apache.org/jira/browse/SOLR-1301
Near realtime search needs to be added incrementally to Solr. Today I
wouldn't recommend it.
On Wed, Sep 2, 2009 at 10:14 AM, Zhenyu Zhong wrote:
> Dear all,
On Wednesday 02 September 2009 15:15:42 Adam Allgaier wrote:
> Touch gently with the Solr newbie. I've searched trying to find an answer
> to this problem with no success. I'm sure it's something small and easy.
>
> I'm using Solr 1.3 with Solrj client
>
> omitNorms="true"/>
> ...
>
>
>
hello *, what would be the best approach to return the sum of boosts
as the score?
ex:
a dismax handler boosts matches to field1^100 and field2^50, a query
matches both fields hence the score for that row would be 150
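
Written out, the desired arithmetic looks like the sketch below. It only illustrates the scoring being asked for; it is not how dismax computes scores, and the function name is hypothetical.

```python
# Boosts taken from the example above: field1^100 and field2^50.
BOOSTS = {"field1": 100, "field2": 50}


def sum_of_boosts(matched_fields, boosts=BOOSTS):
    """Return the sum of the boosts of every field the query matched."""
    return sum(boosts[f] for f in matched_fields if f in boosts)


print(sum_of_boosts(["field1", "field2"]))
```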
is this something i could do with a function query or do i need to
hack up Di
On Wednesday 02 September 2009 16:37:03 Gérard Dupont wrote:
> >
> > Yes, it does - thanks!
> > Back to translating legacy search queries into Solr search queries. :)
> > -Dan
> >
>
> Just curious : what legacy system is it ?
Sorry, but at the moment - I don't think I'm at liberty to say
>
> Yes, it does - thanks!
> Back to translating legacy search queries into Solr search queries. :)
> -Dan
>
Just curious : what legacy system is it ?
On Wednesday 02 September 2009 16:00:55 Gérard Dupont wrote:
> Hi Dan,
>
> Phrase search (i.e. using quotes) in Lucene does an exact match of your expression
> so if you type ["david pdf"] (brackets are there to limit the query in my
> mail only) the system searches for a document that contains the term 'd
Is "pdf" inside the file or part of the file name?
What legacy system? I've helped write a couple of them. Some systems,
like Ultraseek, add parts of the filename as searchable text.
wunder
On Sep 2, 2009, at 1:49 PM, Dan A. Dickey wrote:
I'm having a problem with doing a phrase search of
Hi Dan,
Phrase search (i.e. using quotes) in Lucene does an exact match of your expression
so if you type ["david pdf"] (brackets are there to limit the query in my
mail only) the system searches for a document that contains the term 'david'
and the term 'pdf' separated by a space (well in the classic case
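
The difference between a phrase match and a match on either term can be sketched with a toy matcher. The whitespace tokenization and the three sample documents are assumptions for illustration; a real analyzer does more.

```python
def tokens(text):
    """Very rough stand-in for an analyzer: lowercase and split on whitespace."""
    return text.lower().split()


def matches_phrase(text, phrase):
    """True if the phrase terms occur next to each other, in order."""
    doc, want = tokens(text), tokens(phrase)
    return any(doc[i:i + len(want)] == want
               for i in range(len(doc) - len(want) + 1))


def matches_any(text, terms):
    """True if at least one of the terms occurs anywhere in the text."""
    doc = set(tokens(text))
    return any(t in doc for t in tokens(terms))


# Hypothetical documents.
docs = ["david pdf report", "david wrote a letter", "manual in pdf format"]
print(sum(matches_phrase(d, "david pdf") for d in docs))
print(sum(matches_any(d, "david pdf") for d in docs))
```

A system that treats the two words as alternatives reports far more hits than one that requires them side by side, which is one possible reason for hit counts that differ this much between two engines.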
I'm having a problem with doing a phrase search of "david pdf".
When I search for just "david", I get 7 hits. When I search for "pdf"
I get 73 hits. On a legacy system, searching for "david pdf" I get
78 hits. And on Solr (1.4 - one of the nightly builds) - when searching
for "david pdf" I get 0
Touch gently with the Solr newbie. I've searched trying to find an answer to
this problem with no success. I'm sure it's something small and easy.
I'm using Solr 1.3 with Solrj client
...
I am indexing the "specific_LIST_s" with the value "For Sale".
The document indexes just fine. A qu
http://www.entropy.ch/software/MacOSX/xmlviewplugin/
Lucas Frare Teixeira .·.
- lucas...@gmail.com
- blog.lucastex.com
- twitter.com/lucastex
On Wed, Sep 2, 2009 at 3:28 PM, Paul Tomblin wrote:
> Slightly off topic, but I'm getting tired of hitting the 'view source'
> keyboard shortcut every t
Slightly off topic, but I'm getting tired of hitting the 'view source' keyboard
shortcut every time I do a solr query. Is there a way to make Safari display
xml as-is?
-- Sent from my Palm Prē
I would recommend using the IndexReader class.
That could be the fastest possible :)
Cheers
Rajan
On Wed, Sep 2, 2009 at 2:22 PM, Jason Rutherglen wrote:
> I needed to mention through the web UI. Solr Luke takes ages to load.
>
> On Wed, Sep 2, 2009 at 11:05 AM, rajan chandi
> wrote:
> > Use A
I needed to mention through the web UI. Solr Luke takes ages to load.
On Wed, Sep 2, 2009 at 11:05 AM, rajan chandi wrote:
> Use Apache Luke.
>
> If you're using a newer Lucene, you might need to add the Lucene 2.9 JAR files
> to Luke and build it.
>
> Cheers
> Rajan
>
>
> On Wed, Sep 2, 2009 at 2:02
Use Apache Luke.
If you're using a newer Lucene, you might need to add the Lucene 2.9 JAR files
to Luke and build it.
Cheers
Rajan
On Wed, Sep 2, 2009 at 2:02 PM, Jason Rutherglen wrote:
> Is there a quick way to view index files?
>
Is there a quick way to view index files?
Great Thanks Aakash for your inputs!
We'll try to do some research and possibly bench-marks before we move
forward.
Regards
Rajan
On Wed, Sep 2, 2009 at 1:27 PM, Aakash Dharmadhikari wrote:
> hi Rajan,
>
> More knowledgeable people might be able to provide better insight into
> the performance
Hi Angel,
I'm looking into it. Might need a new SolrRequest, but still playing
around and will let you know...
-Grant
On Sep 2, 2009, at 4:56 AM, Angel Ice wrote:
Hi everybody.
I hope it's the right place for questions, if not sorry.
I'm trying to index rich documents (PDF, MS docs etc)
hi Rajan,
More knowledgeable people might be able to provide better insight into
the performance issues, but I have a doubt around this ORing business.
The best option I see is storing all my friends IDs in my documents as
multi valued field. This in contrast to OR queries would make queryin
Thanks very much!
I suppose I’m still very much a beginner in Solr; I was supposing I could do it
directly.
I did what you said and it seems to work perfectly!
public class PolishStemFilterFactory extends BaseTokenFilterFactory {
public StempelFilter create(TokenStream in) {
*
: solr.DateField compatible format. I wrote a new
: definition inside the solrconfig.xml, which creates
: eg. 1991-01-01T00:00:01Z from the input '[c1991.]' string.
is only supported when the class of the is
TextField ... it would be nice if it worked with any other field type (i
think it wou
Dear all,
I am very interested in Solr and would like to deploy Solr for distributed
indexing and searching. I hope you are the right Solr expert who can help me
out.
However, I have concerns about the scalability and management overhead of
Solr. I am wondering if anyone could give me some guidanc
> Just execute 20 SQL queries with filters
> Same with SOLR vs. Lucene, standard Lucene queries "filter1:value1 AND
> facet2:value2" ... "filter1:value1 AND facet2:value99" are functionally
the
> same as SOLR faceting (99 docset intersections in RAM) and (sooner or
later)
> implementation
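
The "docset intersections" view of faceting mentioned in the quote can be pictured with plain sets. This is a toy model with made-up document ids, not Solr's implementation.

```python
def facet_counts(filtered_docs, facet_index):
    """Count, for each facet value, how many filtered documents carry it.

    filtered_docs: set of ids matching the filter (e.g. filter1:value1)
    facet_index:   mapping of facet value -> set of ids having that value
    """
    return {value: len(filtered_docs & ids)
            for value, ids in facet_index.items()}


# Hypothetical data: ids matching the filter, and an index for facet2.
matching_filter1 = {1, 2, 3, 5, 8}
facet2 = {"value2": {1, 2, 9}, "value3": {3, 4}, "value99": {7}}
print(facet_counts(matching_filter1, facet2))
```

Each facet value costs one intersection, which is the same work as running one filtered count query per value.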
> This article explains in-depth why calculating facets is not practical in
> pure SQL: http://www.kimbly.com/blog/000239.html
-> "The problem is that SQL isn't really capable of expressing set
intersections."
But this article is not applicable to described use case: we are
_faceting_on_filtered_
Hi gwk,
Thanks for reply!
Yes, SOLR gives out-of-the-box
- indexes
- implicit data normalization
- fault-tolerance, replication, scalability
- performance
(so that we can save _huge_ money & time)
But from just an engineering viewpoint, forgetting cost&time, SELECT
COUNT(*) ... WHERE ... seems
I have not used these APIs but Actually, You don't need CURL to POST the
document to Solr.
You can execute an HTTP POST using only Java.
http://www.jguru.com/faq/view.jsp?EID=62798
You might want to look at SolrInputDocument.
No matter what mechanism you may use to post the document. The point i
On Wed, Sep 2, 2009 at 8:10 PM, David Espinosa wrote:
> My problem appears when I try to create a Polish stemmed index. There isn’t
> a Snowball implementation for Polish, but I found a lucene one:
>
> http://www.getopt.org/stempel/index.html#distrib
>
> I included the jar into Solr lib folder a
Hi Rajan.
As mentioned in my message, I don't want to use Curl to post documents and
can't use an HTTP POST (the document has already been posted to my JEE webapp
for other purposes). All I can use is just java.
In fact, I'd like the user to post the document to my webapp with an HTML POST
(it
Hi,
I’m developing a multi language Solr index, where I have a single core for
each one. I use SnowballPorterFilterFactory for German, French and Italian
languages with excellent results.
My problem appears when I try to create a Polish stemmed index. There isn’t
a Snowball implementation for Po
Laurent,
Check-out Solr 1.4.
You can download the trunk and Build it on your box.
The Solr 1.4 does this out-of-the-box. No configuration required.
You can use HTTP POST to post the document using some Linux utility like
Curl and the PDF/Word/RTF/PPT/XLS etc. will be indexed. We tested this las
Hi,
the exception I received:
SEVERE: org.apache.solr.common.SolrException: Error while creating field
'date_df{type=trickyDate,properties=indexed,stored,omitNorms,omitTf,multiValued,sortMissingLast}'
from value 'c1991.'
at org.apache.solr.schema.FieldType.createField(FieldType.java:19
What's the exception?
On Sep 2, 2009, at 3:00 AM, Peter Kiraly wrote:
Hi Solr users,
I have lots of dates from a library catalog that are not in a
solr.DateField compatible format. I wrote a new
definition inside the solrconfig.xml, which creates
eg. 1991-01-01T00:00:01Z from the input '[c1991.]' stri
This article explains in-depth why calculating facets is not practical in
pure SQL: http://www.kimbly.com/blog/000239.html
Cheers,
Mauricio
On Wed, Sep 2, 2009 at 5:30 AM, gwk wrote:
> Fuad Efendi wrote:
>
>> "No results found for 'surface area 377', displaying all properties."
>> - why do we n
Hi everybody.
I hope it's the right place for questions, if not sorry.
I'm trying to index rich documents (PDF, MS docs etc) in SolR/Lucene.
I have seen a few examples explaining how to use tika to solve this. But most
of these examples are using curl to send documents to Solr or an HTML POST wi
Thank you Birger for the pointer to HBase.
HBase sounds interesting. We will consider this for - "people you may
know".
We are trying to address a different problem of searching from a well
defined list of contacts.
A huge ORed query sounds good at this point as a solution.
Thanks and regards
R
Hi Gwk,
Thanks for the pointers.
The only concern will be the relevance.
Lucene has the best relevance capability so far. CouchDB sounds to be
interesting though.
Maybe we'll try to find some benchmarks on relevance score of CouchDB.
Thanks and Regards
Rajan Chandi
On Wed, Sep 2, 2009 at 4:04
HI,
I might be unclear in what I mean.
Usually people have friends in common, so if you
1) create and store a relationship between user x and y, and give that
an id.
2) if x knows z, then there is a probability that y might know z as well.
If that is the case then add z to the relation and you d
Hello Rajan,
I might be mistaken, but isn't CouchDB or a similar map/reduce database
ideal for situations like this?
Regards,
gwk
rajan chandi wrote:
Hi All,
We are dealing with a very complex problem of person specific search.
We're building a social network where people will post stuff
Gerald and Birger, thank you for your quick responses.
In our situation, users will tend to upload more than find new friends.
We are currently considering doing the ORing of the contacts on the fly as
part of the search query.
Please correct me if I am wrong but here is what I understand fr
Hi,
If you store all mutual relations in a database, a lot of the relations will
overlap. This is easily done using distinct clauses in sql. Use the overlapped
values as tags on documents. That way you gain tremendous performance in search
time, Obviously updating documents are a performance los
Hi Solr users,
I have lots of dates from a library catalog that are not in a
solr.DateField compatible format. I wrote a new
definition inside the solrconfig.xml, which creates
eg. 1991-01-01T00:00:01Z from the input '[c1991.]' string.
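
The transformation itself can be sketched outside Solr as follows. This is a standalone illustration under the assumption that the first four-digit run in the catalog string is the year; it is not the definition from the configuration file, which is not shown here.

```python
import re

# Assumption: the first run of four digits in the raw value is the year.
YEAR = re.compile(r"(\d{4})")


def to_solr_date(raw):
    """Turn a catalog date such as '[c1991.]' into a DateField-style string.

    Returns None when no four-digit year can be found, so the caller can
    skip the field instead of sending an unparseable value.
    """
    match = YEAR.search(raw)
    if match is None:
        return None
    return match.group(1) + "-01-01T00:00:01Z"


print(to_solr_date("[c1991.]"))
```

Returning None for values without a recognizable year leaves it to the indexing code to decide whether to drop the field or log the record.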
It works fine when I tried it with the typical values
in the http://l
Hi,
The big OR query should be the easiest way and it may work up to ~1000 users
(i.e. by default you can specify 1024 boolean clauses, so up to N users in the
OR where N = 1024 - (boolean clauses in your query)). You can increase this
limit of boolean clauses in the configuration but I guess too much
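
A minimal sketch of building such an OR clause while respecting the clause budget described above. The `author_id` field name is hypothetical, and the limit is simply the default of 1024 quoted in this message.

```python
MAX_BOOLEAN_CLAUSES = 1024  # default limit quoted above


def contacts_filter(field, contact_ids, other_clauses=0,
                    limit=MAX_BOOLEAN_CLAUSES):
    """Build 'field:(id1 OR id2 ...)' or fail if it would exceed the limit.

    other_clauses is the number of boolean clauses the rest of the query
    already uses, so the budget for contacts is limit - other_clauses.
    """
    ids = list(contact_ids)
    if not ids:
        raise ValueError("no contacts to search")
    budget = limit - other_clauses
    if len(ids) > budget:
        raise ValueError(
            "%d contacts exceed the budget of %d clauses" % (len(ids), budget))
    return field + ":(" + " OR ".join(str(i) for i in ids) + ")"


print(contacts_filter("author_id", [17, 42, 99]))
```

Failing early on the client is only one option; raising the configured limit or splitting the contacts over several requests are alternatives with their own costs.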
Hi All,
We are dealing with a very complex problem of person specific search.
We're building a social network where people will post stuff and other users
should be able to see the content only from their contacts.
e.g. There are 10,000 users in the system and there are only 150 users in my
netw
On Wed, Sep 2, 2009 at 12:44 AM, Chris
Hostetter wrote:
> : The wiki says "As of Solr 1.3, the DisMaxRequestHandler is simply the
> : standard request handler with the default query parser set to the
> : DisMax Query Parser (defType=dismax).". I just made a checkout of svn
> : and dismax doesn't se
Hi there,
I need to log Solr requests on the fly, filter and transform them, and finally
put them into an index.
Any advice on the best way to implement this behaviour?
Key points:
- I think that the use of log files is discouraged, but i don't know if i
can modify solr settings to log to a serv
Fuad Efendi wrote:
"No results found for 'surface area 377', displaying all properties."
- why do we need SOLR then...
Hi Fuad,
The search box is only used for geographical search, i.e.
country/region/city searches. The watermark on the homepage indicates
this but the "search again" box
Chris Hostetter wrote:
: When I added numerical faceting to my checkout of solr (solr-1240) I basically
: copied date faceting and modified it to work with numbers instead of dates.
: With numbers I got a lot of double-counted values as well. So to fix my
: problem I added an extra parameter to n